Video compression encoding technologies are widely used today in applications such as digital broadcasting, distribution of video content on optical discs, and video delivery over the Internet. These technologies have progressed to the point where video signals can be encoded at low bit rates and high compression ratios while maintaining high image quality. Examples of technologies for encoding video signals to generate encoded data and for decoding encoded video include H.261 and H.263 standardized by the ITU (International Telecommunication Union); MPEG (Moving Picture Experts Group)-1, MPEG-2, and MPEG-4 developed by the ISO (International Organization for Standardization); and VC-1 developed by the SMPTE (Society of Motion Picture and Television Engineers). These technologies are widely used as international standards. H.264/MPEG-4 AVC (Advanced Video Coding), jointly standardized by the ITU and the ISO, is also becoming widespread. Hereinafter, H.264/MPEG-4 AVC will be referred to as H.264. Further, a new video compression coding standard, H.265/MPEG-H HEVC (High Efficiency Video Coding), was standardized in 2013. Hereinafter, H.265/MPEG-H HEVC will be referred to as H.265. It is reported that H.265 can compress a video to about 50% of the data size produced by H.264 while maintaining equivalent video quality. H.265 is expected to find use in a wide range of fields.
These video coding technologies are implemented by combining multiple elemental technologies such as motion compensated prediction, orthogonal transform of prediction error images, quantization of orthogonal transform coefficients, and entropy coding of quantized orthogonal transform coefficients. Coding that combines these elements is called hybrid coding.
In the motion compensated prediction mentioned above, a motion vector, which represents the motion of an image between the previous frame and the current frame of a video, is searched for on a per-MB (macroblock) basis. In the following description of the present invention, a process for searching for motion vectors will be referred to as a “motion vector search process”.
In a motion vector search process, processing such as predicted image generation, rate distortion cost calculation, comparison, and selection is repeated for each of many candidate vectors on each MB of an image to be encoded. Accordingly, the motion vector search process requires a large amount of computation and can account for most of the computation in the entire video encoding process. It is therefore important to speed up the motion vector search process in order to speed up video encoding.
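The per-MB loop described above can be illustrated with a minimal sketch. It assumes an exhaustive search over a small window, the sum of absolute differences (SAD) as the distortion measure, and the distance from a predicted vector as a rough stand-in for the rate term. None of these choices is specified in the text, and the names `sad`, `search_motion_vector`, `pred`, and `lam` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def search_motion_vector(cur, ref, bx, by, size=4, search_range=2,
                         pred=(0, 0), lam=1):
    """Return (best_vector, best_cost) for one block.

    Assumed cost model: cost = SAD + lam * (|dx - pred_x| + |dy - pred_y|).
    The second term is only a crude proxy for the bits needed to code the
    vector relative to a predicted vector `pred`.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            # "Predicted image generation" and distortion measurement.
            distortion = sad(cur, ref, bx, by, dx, dy, size)
            # Rate proxy, then the rate distortion cost.
            rate = abs(dx - pred[0]) + abs(dy - pred[1])
            cost = distortion + lam * rate
            # Comparison and selection of the cheapest candidate so far.
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Even in this toy form, each block evaluates (2 × `search_range` + 1)² candidates, and each candidate touches every pixel of the block, which suggests why this step tends to dominate the encoding time.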
Processor design is trending toward multi-core and many-core architectures. CPUs (Central Processing Units) of typical personal computers often include two or four processor cores, and some high-end CPUs include eight or more. Many-core accelerators in which 50 or more processor cores are integrated have been commercialized. Further, GPUs (Graphics Processing Units) used for three-dimensional graphics processing are large-scale parallel processors including several thousand processor cores. A technology called GPGPU (General Purpose Computing on Graphics Processing Units) uses such GPUs for applications other than graphics. If the processing suits the characteristics of a GPU, the GPU can perform it several times to several tens of times faster than a CPU.
If the motion vector search process described above can be parallelized using such a multi-core or many-core processor, it can be significantly sped up, which in turn speeds up video encoding.
Section 4.3.1 of NPL 1 discloses a technology for performing motion vector search processes in parallel. A motion vector search process uses the vectors of adjacent blocks that have already been encoded. In other words, there are dependencies among adjacent blocks, and arbitrary blocks therefore cannot be processed in parallel. In the technology disclosed in NPL 1, processing is parallelized among a plurality of MBs that are spaced apart from one another in a frame and are in a given relative positional relationship with one another, as illustrated in FIG. 14 of NPL 1. The parallel processing starts at the MB at the upper left corner of the frame and proceeds toward the lower right corner while changing the combination of MBs in the given relative positional relationship. This process is called wavefront processing.
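The wavefront ordering can be sketched as follows. This is a minimal illustration, not the method of NPL 1: it assumes that each MB depends on its left, upper-left, upper, and upper-right neighbors, in which case the MB at column x and row y can be assigned to wave x + 2y. The function names and the use of a thread pool are likewise assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def wavefront_schedule(mb_cols, mb_rows):
    """Group MB coordinates (x, y) into waves that can run in parallel.

    Assumption: an MB depends on its left, upper-left, upper, and
    upper-right neighbors, so wave index x + 2 * y places every
    dependency in a strictly earlier wave.
    """
    waves = {}
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves.setdefault(x + 2 * y, []).append((x, y))
    return [waves[k] for k in sorted(waves)]


def run_wavefront(mb_cols, mb_rows, process_mb, max_workers=4):
    """Call process_mb(x, y) for every MB, one wave at a time.

    MBs inside a wave are submitted to a thread pool together; the next
    wave starts only after the current wave has finished.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in wavefront_schedule(mb_cols, mb_rows):
            # list() forces completion and re-raises worker exceptions.
            list(pool.map(lambda mb: process_mb(*mb), wave))
```

For a frame of 4 × 3 MBs, the schedule starts with the single MB at the upper left corner, widens to at most two MBs per wave, and ends at the lower right corner. The MBs in one wave are separated by two columns and one row, which matches the idea of processing MBs that are apart from one another in a fixed relative positional relationship.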
Because processing such as predicted image generation, rate distortion cost calculation, comparison, and selection is repeated for each of many candidate vectors on each MB of an image to be encoded, the motion vector search process requires a large amount of computation and may account for most of the computation in the entire video encoding process. It is therefore important to speed up the motion vector search process in order to speed up video encoding.