A video encoder compresses video information so that more information can be sent over a given bandwidth. The compressed signal may then be transmitted to a receiver that decodes or decompresses the signal prior to display.
Intel's Gen graphics media pipeline leverages an array of cores, or execution units (EUs), to execute a workload. This workload consists of kernels, each a set of instructions comprising a program that is executed on the Gen hardware. Predominantly, video decoder/encoder kernels have thread dependencies at the coding block level, where a thread must wait on the threads it depends on before starting its own execution. In this situation, only a small subset of the total number of threads can actively run on the EUs at any given time, which often results in under-utilization of the EUs. Additionally, the achievable thread parallelism depends heavily on the thread dependency pattern.
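The effect of block-level thread dependencies on parallelism can be illustrated with a minimal sketch. It assumes a hypothetical fixed pattern in which each thread waits on its left, top-left, top, and top-right neighbours, that every thread takes one time step, and that the number of EUs is unlimited. The names `FIXED_DEPS`, `earliest_steps`, and `peak_parallelism` are illustrative only and do not correspond to any Gen hardware or driver interface.

```python
from collections import Counter

# Assumed fixed dependency pattern (illustrative, not taken from the text):
# each thread waits on its left, top-left, top, and top-right neighbours.
FIXED_DEPS = [(-1, 0), (-1, -1), (0, -1), (1, -1)]


def earliest_steps(width, height, deps=FIXED_DEPS):
    """Return steps[y][x], the earliest time step at which thread (x, y) can run.

    Assumes one time step per thread and unlimited EUs. Threads are visited
    in raster order, which is valid because every offset in `deps` points to
    a thread that comes earlier in raster order.
    """
    steps = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            earliest = 0
            for dx, dy in deps:
                px, py = x + dx, y + dy
                # Neighbours outside the thread space impose no dependency.
                if 0 <= px < width and 0 <= py < height:
                    earliest = max(earliest, steps[py][px] + 1)
            steps[y][x] = earliest
    return steps


def peak_parallelism(steps):
    """Return the largest number of threads that share the same time step."""
    counts = Counter(s for row in steps for s in row)
    return max(counts.values())


steps = earliest_steps(8, 4)
total_threads = 8 * 4
num_steps = max(max(row) for row in steps) + 1
peak = peak_parallelism(steps)
print(f"threads={total_threads} steps={num_steps} peak={peak}")
```

For this 8×4 thread space, the sketch reports 32 threads spread over 14 time steps, with at most 4 threads runnable at once, so most EUs in a larger array would sit idle under these assumptions.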
High Efficiency Video Coding (HEVC) is a new video compression standard developed by the Joint Collaborative Team on Video Coding (JCT-VC), formed by the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG). Traditional thread dependency uses a fixed pattern, meaning that all the threads in the same thread space have exactly the same thread dependency pattern. For some dependency logic (e.g., intra prediction in HEVC), a fixed dependency pattern permits only a large thread data granularity (i.e., each thread covers a 64×64 pixel data area).
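To give a sense of what a 64×64 granularity means for the size of the thread space, the following sketch counts threads for a hypothetical 1920×1080 frame. The frame size and the helper name `thread_space` are assumptions for illustration, and the 32×32 and 16×16 rows are shown only for comparison; the sketch does not claim those granularities are achievable with a fixed dependency pattern.

```python
def thread_space(frame_width, frame_height, block_size):
    """Return (columns, rows, total threads) when each thread covers one
    block_size x block_size pixel area. Partial blocks at the right and
    bottom edges are rounded up to a full thread."""
    cols = (frame_width + block_size - 1) // block_size
    rows = (frame_height + block_size - 1) // block_size
    return cols, rows, cols * rows


# Assumed frame size of 1920x1080 (illustrative only).
for block_size in (64, 32, 16):
    cols, rows, total = thread_space(1920, 1080, block_size)
    print(f"{block_size}x{block_size}: {cols} x {rows} = {total} threads")
```

Under these assumptions, a 64×64 granularity yields a 30 × 17 thread space of 510 threads, compared with 2040 threads at 32×32 and 8160 at 16×16, so the coarse granularity leaves far fewer threads available to fill the EUs.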