The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.
With the rapid development of multimedia technologies, demand is growing for high-quality multimedia data including audio, images, and video. To meet the demand to transmit, store, and retrieve such multimedia data within a limited network environment, international standards are being established for high-efficiency video compression. Specifically, for video, the ISO/IEC JTC1/SC29 MPEG group and the ITU-T VCEG group have created the H.264/AVC (MPEG-4 Part 10) standard, which attempts to achieve high compression efficiency by using various prediction encoding methods such as variable block size motion estimation and compensation and intra prediction encoding. Prediction encoding is an effective method of reducing the correlation that exists between data, and it is widely used for compressing various types of data. In particular, because the motion vector of a block is highly correlated with the motion vectors of adjacent blocks, it is possible to first calculate a prediction value, or predicted motion vector (PMV), for the motion vector of a current block from the motion vectors of adjacent blocks, and then to encode not the true value of the motion vector of the current block but only a differential value, or differential motion vector (DMV), relative to the prediction value, thereby substantially reducing the bit quantity and improving the coding efficiency.
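The PMV/DMV relationship described above can be illustrated with a minimal sketch. The function names `encode_dmv` and `decode_mv` and the tuple representation of a motion vector are assumptions made for illustration only; they do not come from the disclosure or from any standard.

```python
from typing import Tuple

# Hypothetical representation: (horizontal, vertical) components as integers.
MotionVector = Tuple[int, int]


def encode_dmv(mv: MotionVector, pmv: MotionVector) -> MotionVector:
    """Encoder side: transmit only the difference between the true MV and the PMV."""
    return (mv[0] - pmv[0], mv[1] - pmv[1])


def decode_mv(dmv: MotionVector, pmv: MotionVector) -> MotionVector:
    """Decoder side: rebuild the true MV from the received DMV and the same PMV."""
    return (dmv[0] + pmv[0], dmv[1] + pmv[1])


if __name__ == "__main__":
    mv, pmv = (5, -3), (4, -2)
    dmv = encode_dmv(mv, pmv)
    # A good prediction leaves a small residual, which is cheaper to entropy-code.
    print(dmv, decode_mv(dmv, pmv))
```

The reconstruction is exact only if the encoder and the decoder derive the same PMV, which is why the choice of predictor must either be fixed in advance or signaled.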
Generally, when a motion vector is encoded using such a predicted motion vector, the more accurate the predicted motion vector is, the higher the coding efficiency becomes. Therefore, a possible way of improving the efficiency of the prediction encoding is to generate a finite number of candidate predicted motion vectors, comprising not only the motion vectors of the spatially adjacent blocks but also the motion vectors of temporally or spatio-temporally adjacent blocks and further motion vectors calculated by combining them, and to select from the generated candidates the one most appropriate for the prediction encoding of the motion vector. In this case, to correctly reconstruct the original motion vector from the prediction-encoded motion vector data, it is necessary to know which one of the finite number of candidate predicted motion vectors was used. The simplest motion vector prediction encoding method for this purpose is to additionally encode information identifying the predicted value actually used to perform the prediction encoding of the motion vector. Alternatively, to reduce the bit quantity required to encode the additional information indicating the selected predicted motion vector, the current H.264/AVC standard uses the medians of the respective horizontal components and vertical components of the motion vectors of the adjacent blocks (at the left, upper, and upper right sides of the current block) as the predicted motion vector (PMV) for the prediction encoding of the motion vector. This method fixes a default rule in the form of a median that is known to both the encoding and decoding operations, produces the prediction value (predicted motion vector) by that default rule, and thereby obviates the need for additionally encoding information on the prediction value used.
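The component-wise median described above can be sketched as follows. This is a simplified illustration only: it assumes all three neighboring motion vectors are available and ignores the special cases that the actual H.264/AVC derivation handles (unavailable neighbors, differing reference pictures, particular partition shapes). The function name `median_pmv` is hypothetical.

```python
from typing import Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical); illustrative representation


def median_pmv(left: MotionVector,
               upper: MotionVector,
               upper_right: MotionVector) -> MotionVector:
    """Take the median of the horizontal and vertical components independently."""
    neighbors = (left, upper, upper_right)
    xs = sorted(v[0] for v in neighbors)
    ys = sorted(v[1] for v in neighbors)
    # With three values, the median is the middle element after sorting.
    return (xs[1], ys[1])


if __name__ == "__main__":
    # The result need not equal any single neighbor, because each
    # component is chosen separately.
    print(median_pmv((1, 5), (3, 0), (2, -2)))
```

Because the decoder can evaluate the same rule on the same previously decoded neighbors, no selection information needs to be transmitted for this predictor.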
The conventional method of predefining the median as the default saves the transmission of additional information identifying the motion vector used as the predicted motion vector, but it remains deficient because the median actually used as the predicted motion vector is not necessarily the best predicted motion vector, that is, the one that minimizes the bit quantity required for encoding the differential motion vector.
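A small worked example may make this deficiency concrete. The sketch below compares the cost of the DMV under the median predictor with the cost under each individual neighbor. It assumes, purely for illustration, that each DMV component is coded with a signed Exp-Golomb code and uses that code length as a proxy for the bit quantity; the disclosure does not specify the entropy code, and the names `se_bits`, `dmv_bits`, and the sample vectors are hypothetical.

```python
import statistics
from typing import Dict, Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical); illustrative representation


def se_bits(v: int) -> int:
    """Length in bits of a signed Exp-Golomb codeword for v (assumed cost model)."""
    code_num = 2 * v - 1 if v > 0 else -2 * v
    return 2 * (code_num + 1).bit_length() - 1


def dmv_bits(mv: MotionVector, pmv: MotionVector) -> int:
    """Bits needed for both DMV components under the assumed cost model."""
    return se_bits(mv[0] - pmv[0]) + se_bits(mv[1] - pmv[1])


if __name__ == "__main__":
    mv = (8, 0)  # true motion vector of the current block
    neighbors: Dict[str, MotionVector] = {
        "left": (8, 0), "upper": (0, 0), "upper_right": (1, 1),
    }
    median = (
        statistics.median(v[0] for v in neighbors.values()),
        statistics.median(v[1] for v in neighbors.values()),
    )
    print("median PMV", median, "costs", dmv_bits(mv, median), "bits")
    for name, pmv in neighbors.items():
        print(name, pmv, "costs", dmv_bits(mv, pmv), "bits")
```

In this sample, the left neighbor predicts the motion vector exactly, so its DMV is cheaper than the one produced by the median. Exploiting that saving, however, requires telling the decoder which candidate was chosen, and the cost of that signaling is not counted here.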
Generally, video compression encoding methods provide a variety of highly sophisticated encoding techniques that compete against each other, and a predetermined evaluation criterion is applied to select the encoding technique showing the best encoding efficiency, thereby increasing the encoding efficiency. Data compressed in this way follows a rule or protocol agreed between encoders and decoders and is stored or transmitted in the form of a bitstream whose components are each called a syntax element. For example, when an encoder has to encode a motion vector in the course of compressively encoding a video by using motion compensation to remove temporal redundancy, it prepares different motion vectors within a search range, searches for an optimal motion vector, and thereafter signals to the decoder which one of the predicted motion vectors was used, as described above. In this case, the information indicating which predicted motion vector was used may be regarded as an example of a syntax element. Alternatively, instead of simply relaying the predicted motion vector that was used, its difference from a certain predefined predicted motion vector such as the median may be encoded. As a further alternative, differently predetermined predicted motion vectors may be used adaptively depending on the case. In those cases, the method of selecting the predicted motion vector should also be notified to the decoder, and the notifying information may likewise become an example of a syntax element.
For the decoder to properly decode data compressed by using more diverse and sophisticated encoding methods, a large amount of syntax element information has to be added to the bitstream. In such a case, the necessary transmission or storage of the syntax element information entails an increased amount of bits and, in turn, an increased amount of data needed to encode still images or videos. In other words, using more diverse and sophisticated encoding methods may improve the encoding efficiency, but a prerequisite for proper decoding is that the decoder be notified of which encoding method was used and how it was used. The information concerned therefore has to be transmitted or stored, so that the improvement in encoding efficiency obtained by using more sophisticated encoding methods may be defeated by the overhead of notifying the decoder of the encoding method used. That is, the increase in syntax elements needed for the signaling raises the bit cost and may result in an actual degradation of the video compression performance.
One desirable solution to such a problem is to spare the encoder from having to store or transmit to the decoder a syntax element value that it determined by a predetermined encoding criterion, by having the decoder itself estimate the syntax element value through its own syntax element estimation process in the course of its decoding operation. However, this solution has the shortcoming that it is not applicable to general cases, because decoders can carry out only a very limited estimation process, whereas encoders can make a wide variety of determinations. Therefore, the method is selectively applicable only to those cases where the decoder autonomously arrives at the same syntax element value as the one determined by the encoder. However, the selective syntax transmission method, in which a syntax element is transmitted at some times and omitted at others, requires the decoder to determine the presence of a syntax element through an estimation process that uses a previously decoded image value or the decoding process itself, and a problem therefore arises in that the step of parsing the syntax cannot be separated from the decoding process. Moreover, if an error is contained in the result or process of decoding that is needed for the syntax element estimation, the presence of the syntax element cannot be determined correctly. The decoder might then mistakenly attempt to parse a syntax element that was never transmitted, or determine that no parsing is needed even though the syntax element was transmitted, either of which would cause a serious disturbance in the parsing or decoding process. In view of this difficulty, the present disclosure seeks to provide a method and an apparatus for resolving this long-standing problem.
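The parsing dependency described above can be illustrated with a deliberately simplified toy scheme. Everything in the sketch below is a hypothetical construction for illustration and is not the method of the disclosure or of any standard: motion vectors are single integers, the two candidate predictors are the previously reconstructed value and zero, the predictor index is omitted from the stream whenever the two candidates coincide, and a decoding error is simulated by forcing one reconstructed value to zero.

```python
from typing import List, Optional, Tuple


def encode(mvs: List[int]) -> List[int]:
    """Toy encoder: emit a predictor index only when the candidates differ."""
    stream: List[int] = []
    prev = 0
    for mv in mvs:
        candidates = (prev, 0)
        if candidates[0] != candidates[1]:
            # Pick the candidate that leaves the smaller residual and signal it.
            idx = min((0, 1), key=lambda i: abs(mv - candidates[i]))
            stream.append(idx)
        else:
            idx = 0  # index omitted; the decoder is expected to infer it
        stream.append(mv - candidates[idx])
        prev = mv
    return stream


def decode(stream: List[int], n_blocks: int,
           corrupt_block: Optional[int] = None) -> Tuple[List[int], int]:
    """Toy decoder: must reconstruct values to know whether an index is present."""
    pos, prev, out = 0, 0, []
    for b in range(n_blocks):
        candidates = (prev, 0)
        if candidates[0] != candidates[1]:
            idx = stream[pos]
            pos += 1
        else:
            idx = 0
        mv = candidates[idx] + stream[pos]
        pos += 1
        if b == corrupt_block:
            mv = 0  # simulated error in the decoded value
        out.append(mv)
        prev = mv
    return out, pos  # pos = number of stream elements consumed


if __name__ == "__main__":
    stream = encode([0, 3, 3])
    print(stream)
    print(decode(stream, 3))                   # error-free decoding
    print(decode(stream, 3, corrupt_block=1))  # error changes what gets parsed
```

In the error-free run the decoder consumes the whole stream. In the corrupted run the decoder wrongly concludes that no index was sent for the last block, reads the index as if it were a residual, and leaves one element unread, so a single value error turns into a loss of parsing synchronization.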
In addition, the present disclosure seeks to provide a method and an apparatus that group identical syntax elements together for each predetermined unit and transmit them in that form, in order to improve the compression efficiency and simplify the operation of the decoders.
Additionally, in order to achieve improved efficiency and to solve the above-identified problem by precluding the selective transmission and storage of the syntax element, a new method of encoding and decoding such a syntax element is needed. The present disclosure encompasses a method and an apparatus for encoding and decoding the syntax elements of still images and videos to achieve the goal described.