In general, the volume of video data is huge. For this reason, an apparatus for handling video data usually performs high-efficiency encoding on video data when transmitting the video data to a different apparatus, or when storing the video data in a storage device. “High-efficiency encoding” is an encoding process for converting a data string into a different data string to compress data volume.
Intra-picture prediction (intra prediction) coding is known as an example of the high-efficiency coding scheme for video data. This coding scheme is based on the characteristics that video data are highly correlated in terms of space, and is performed without using any other encoded picture. Hence, a picture encoded by intra-picture prediction coding can be decoded only by using information on the picture itself.
Inter-picture prediction (inter prediction) coding is known as another example of the high-efficiency coding scheme. This coding scheme is based on the characteristic that video data are highly correlated in terms of time. In video data, a picture at a given time and the picture subsequent to it have a high degree of similarity in many cases, and inter prediction coding exploits this characteristic. In general, a video encoder divides an original coding-target picture into a plurality of coding blocks. For each coding block, the video encoder selects, as a reference region, a region similar to the coding block from a reference picture obtained by decoding an encoded picture, and calculates a prediction error image indicating the difference between the reference region and the coding block, to thereby remove redundancy in terms of time. By encoding the prediction error image and motion vector information indicating the reference region, the video encoder achieves a high compression ratio. In general, inter prediction coding achieves higher compression efficiency than intra prediction coding.
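The block-based procedure described above can be sketched as follows. This is only a minimal illustration, assuming an exhaustive search and a sum-of-absolute-differences cost; the function names (`sad`, `crop`, `motion_search`), the search range, and the cost measure are hypothetical choices made for the example and are not prescribed by any of the coding schemes mentioned here.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(picture, top, left, size):
    """Extract a size x size block whose upper-left pixel is (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def motion_search(target, reference, top, left, size, search_range):
    """Find the reference region most similar to one coding block.

    Returns (motion_vector, prediction_error), where motion_vector is
    (dy, dx) and prediction_error is the coding block minus the region.
    The exhaustive search and the SAD cost are assumptions of this sketch.
    """
    height, width = len(reference), len(reference[0])
    block = crop(target, top, left, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference picture.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            candidate = crop(reference, y, x, size)
            cost = sad(block, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), candidate)
    _, motion_vector, region = best
    prediction_error = [[b - r for b, r in zip(b_row, r_row)]
                        for b_row, r_row in zip(block, region)]
    return motion_vector, prediction_error
```

When the coding block has an exact match in the reference picture, the prediction error is all zeros, so only the motion vector carries information; this is the temporal redundancy that the scheme removes.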
Typical, widely used video coding schemes that employ the above-described coding schemes are Moving Picture Experts Group phase 2 (MPEG-2), MPEG-4, and H.264/MPEG-4 Advanced Video Coding (H.264/MPEG-4 AVC), standardized by the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC). In these coding schemes, which of intra prediction coding and inter prediction coding has been selected for each picture is explicitly recorded, for example, in the video stream including the encoded video data. The selected prediction coding scheme is referred to as a coding mode. When the selected coding mode is the intra prediction coding mode, a video encoder can select any one of a plurality of prediction modes, each of which specifies a method of generating a prediction block of a coding block.
FIG. 1 is a view illustrating eight types of prediction modes in accordance with H.264 used for a coding block of 4×4 pixels. As illustrated in FIG. 1, a prediction block of a coding block 100 is generated on the basis of the values of respective encoded pixels 101 located around the coding block 100. In FIG. 1, each arrow 102 indicates a reference direction in the corresponding prediction mode. For example, in Prediction mode 0, the value of every pixel on each vertical line of the prediction block is set to the value of the pixel that lies on the same vertical line immediately above the coding block 100.
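As an illustration of Prediction mode 0 only, the following minimal sketch builds a prediction block by copying the row of encoded pixels immediately above the coding block down each vertical line. The function names `predict_vertical` and `prediction_error` are hypothetical and chosen for this example; the other prediction modes are not shown.

```python
def predict_vertical(above):
    """Prediction mode 0: each column repeats the pixel directly above it.

    `above` holds the encoded pixels adjacent to the top edge of the
    coding block, one per column (four values for a 4x4 block).
    """
    return [list(above) for _ in range(len(above))]


def prediction_error(block, prediction):
    """Difference between the coding block and its prediction block."""
    return [[b - p for b, p in zip(b_row, p_row)]
            for b_row, p_row in zip(block, prediction)]


above = [10, 20, 30, 40]
block = [[10, 20, 30, 40],
         [11, 20, 30, 40],
         [10, 20, 29, 40],
         [10, 20, 30, 40]]
prediction = predict_vertical(above)
error = prediction_error(block, prediction)
```

In this example the block consists of nearly constant vertical lines, so `error` is zero except for two small values, which is the situation in which this prediction mode is effective.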
In these video coding schemes, an I picture, a P picture, and a B picture are defined. An I picture is a picture encoded by using only information on the picture itself. A P picture is a picture that is encoded by inter prediction coding using information on one encoded picture. A B picture is a picture that is encoded by bidirectional predictive coding using information on two encoded pictures. The directions, in terms of time, of the two reference pictures referred to by a B picture are denoted by L0 and L1. One of the two reference pictures referred to by the B picture may precede the B picture in terms of time, and the other reference picture may be subsequent to the B picture in terms of time. In this case, for example, the L0 direction corresponds to the forward direction from the coding-target picture, i.e., the B picture, in terms of time, while the L1 direction corresponds to the backward direction from the coding-target picture in terms of time. Alternatively, both of the two reference pictures may be pictures preceding the B picture in terms of time. In this case, both the L0 direction and the L1 direction correspond to the forward direction from the coding-target picture in terms of time. Further, both of the two reference pictures may be pictures subsequent to the B picture in terms of time. In this case, both the L0 direction and the L1 direction correspond to the backward direction from the coding-target picture in terms of time.
For real-time communication of video data that are encoded in accordance with these coding schemes, attempts have been made to reduce delay in video encoders and video decoders. For example, in a scheme aiming to reduce delay according to H.264, backward prediction, in which a picture subsequent to a coding-target picture in terms of time is referred to, is not employed in order to prevent delay due to rearrangement of pictures. A video encoder divides a picture into blocks each having 16×16 pixels. The obtained blocks are referred to as macro-blocks. A line of macro-blocks is referred to as a slice. Macro-blocks can be categorized into intra-macro-blocks for intra prediction coding, and inter-macro-blocks for inter prediction coding. To further reduce delay, an intra-refresh scheme is also proposed, in which all the data in a slice are encoded as intra-macro-blocks (see Japanese Examined Patent Publication No. H06-101841, for example).
With reference to FIGS. 2A and 2B, the intra-refresh scheme will be described. FIG. 2A illustrates an example in which a refreshed region increases vertically, while FIG. 2B illustrates an example in which a refreshed region increases horizontally. In FIGS. 2A and 2B, the horizontal axis represents time. Each of pictures 201 to 205 is encoded as a P picture, or as a B picture that refers only to a preceding picture. For each successive picture, the video encoder shifts the position of the slice to which the intra-refresh is applied, from the 0-th macro-block line through the t-th macro-block line and then to the (t+1)-th macro-block line. In this way, the video encoder cyclically moves the slice to which the intra-refresh is applied across the entire picture in a certain refresh cycle. For example, in FIG. 2A, a refreshed region 210, which is the region through which the slice to which the intra-refresh is applied has already passed, is extended downward with time. By contrast, in FIG. 2B, a refreshed region 220 is extended rightward with time. Each block in a refreshed region, i.e., the region above a refresh boundary 230 in FIG. 2A, is encoded by referring only to a refreshed region of a preceding encoded picture or to an already encoded part of the refreshed region of the current picture. Since the entire picture is refreshed once the slice to which the intra-refresh is applied has traversed the whole picture, the video decoder can resume decoding from the picture after the refresh even when a transmission error or the like makes it impossible to decode a picture. Moreover, the video decoder can start decoding from the middle of a video stream. Further, since no I picture having a large amount of information is used, the buffer in each of the video encoder and the video decoder can be small in size. As a result, the latency caused by the buffers can be decreased. Furthermore, as illustrated in FIG. 2B, using a vertical macro-block line as the slice to which the intra-refresh is applied can make the amount of information per macro-block line even. In this way, control of the amount of information by the video encoder is made simpler.
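The cyclic shift of the intra-refresh slice described above can be sketched as follows. This is a minimal illustration under two assumptions that the text does not fix: exactly one macro-block line is refreshed per picture, and the refreshed region is counted anew at the start of each refresh cycle. The function names are hypothetical.

```python
def intra_refresh_line(picture_index, num_mb_lines):
    """Macro-block line that is the intra-refresh slice in a given picture.

    Assumes one line is refreshed per picture, so the refresh cycle is
    num_mb_lines pictures long and then starts again from line 0.
    """
    return picture_index % num_mb_lines


def refreshed_lines(picture_index, num_mb_lines):
    """Macro-block lines forming the refreshed region of a given picture.

    The region covers every line the intra-refresh slice has passed
    through in the current cycle, including the current slice.
    """
    current = intra_refresh_line(picture_index, num_mb_lines)
    return list(range(current + 1))


# With 5 macro-block lines, the slice visits lines 0, 1, 2, 3, 4 and
# then returns to line 0 for the sixth picture.
schedule = [intra_refresh_line(i, 5) for i in range(7)]
```

The same schedule applies to both FIG. 2A and FIG. 2B; only the interpretation of a macro-block line as horizontal or vertical differs.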
Note that intra prediction coding is not necessarily performed for each macro-block included in a slice to which the intra-refresh is applied, and the video encoder may perform inter prediction coding by referring only to a refreshed region of a preceding encoded picture. However, it is preferable, in consideration of coding efficiency, that the video encoder use intra-macro-blocks as macro-blocks included in a slice to which the intra-refresh is applied.
With reference to FIG. 3, a method of encoding within a picture when the intra-refresh is applied will be described. In a picture 300, a region 302, which is located on the left of a refresh boundary 301, is a refreshed region through which the slice to which the intra-refresh is applied has already traversed. By contrast, a region 303, which is located on the right of the refresh boundary 301, is an unrefreshed region through which the slice to which the intra-refresh is applied has not traversed yet. In FIG. 3, each block 304 is a macro-block.
Inter prediction coding is performed on each inter macro-block, i.e., each macro-block to be encoded by inter prediction coding, in the refreshed region 302 by referring to a preceding encoded picture. For each intra macro-block, i.e., each macro-block to be encoded by intra prediction coding, included in a slice 305, which is the slice to which the intra-refresh is applied and which is adjacent to the refresh boundary 301, the usable prediction modes are restricted so that the intra macro-block does not refer to any pixel located on the other side of the refresh boundary 301. In general, by using slices, the video encoder can prohibit each macro-block in a slice from being predicted on the basis of data in a different slice. In addition, methods of directly prohibiting the use of one or more particular prediction modes without using a slice are known. For example, when the right end of the coding block 100 in FIG. 1 is adjacent to a refresh boundary, the use of Prediction modes 3 and 7 is prohibited for the coding block 100, since these modes refer to pixels located to the upper right of the coding block.
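The direct prohibition of particular prediction modes can be sketched as follows. This is a minimal illustration that covers only the case given in the text, namely a refresh boundary on the right of the coding block and Prediction modes 3 and 7. The function name `usable_modes`, the pixel-coordinate convention (the boundary is the x position of the first unrefreshed column), and the use of mode numbers 0 to 8 from the H.264 4×4 intra numbering are assumptions made for the example.

```python
# Prediction modes named in the text as prohibited when the right end of
# the coding block is adjacent to the refresh boundary.
MODES_PROHIBITED_AT_RIGHT_BOUNDARY = frozenset({3, 7})


def usable_modes(candidate_modes, block_left, block_width, boundary_x):
    """Return the prediction modes a coding block may use.

    block_left and block_width are in pixels; boundary_x is the x position
    of the first pixel column on the unrefreshed side (an assumed
    convention). Only the right-boundary case is handled.
    """
    right_end = block_left + block_width
    if right_end == boundary_x:
        return [mode for mode in candidate_modes
                if mode not in MODES_PROHIBITED_AT_RIGHT_BOUNDARY]
    return list(candidate_modes)


# A 4x4 block whose right end touches a refresh boundary at x = 16.
at_boundary = usable_modes(range(9), block_left=12, block_width=4, boundary_x=16)
# A 4x4 block one block further left is unrestricted.
inside = usable_modes(range(9), block_left=8, block_width=4, boundary_x=16)
```

Filtering the candidate list before mode selection keeps the rest of the encoder unchanged, at the cost of a slightly smaller choice of modes for blocks next to the boundary.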