Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, smartphones, video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), and extensions of such standards. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video frame or a portion of a video frame) may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs), and/or coding nodes. CUs may be further partitioned into one or more prediction units (PUs) to determine predictive video data for the CU. The video compression techniques may also partition the CUs into one or more transform units (TUs) of residual video block data, which represents the difference between the video block to be coded and the predictive video data. Linear transforms, such as a two-dimensional discrete cosine transform (DCT), may be applied to a TU to transform the residual video block data from the pixel domain to the frequency domain to achieve further compression. Further, video blocks in an intra-coded (I) slice of a picture may be encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
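As an illustration of the residual and transform steps described above, the following is a minimal sketch that forms a residual block and applies an orthonormal two-dimensional DCT-II to it. The function names `residual` and `dct_2d`, the 4×4 block size, and the sample values are assumptions chosen for illustration; a standard-conformant codec would use the integer transform approximations defined by the applicable standard rather than this floating-point form.

```python
import math


def residual(original, predicted):
    """Sample-wise difference between the block to be coded and its prediction."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # 1-D DCT-II basis: basis[k][i] = scale(k) * cos((2i + 1) * k * pi / (2n))
    basis = [[scale(k) * math.cos((2 * i + 1) * k * math.pi / (2 * n))
              for i in range(n)] for k in range(n)]
    # The transform is separable: transform each row, then each column.
    rows = [[sum(basis[k][i] * row[i] for i in range(n)) for k in range(n)]
            for row in block]
    return [[sum(basis[k][j] * rows[j][c] for j in range(n)) for c in range(n)]
            for k in range(n)]


# Hypothetical 4x4 block whose prediction is off by a constant 2 everywhere.
original = [[12] * 4 for _ in range(4)]
predicted = [[10] * 4 for _ in range(4)]
coeffs = dct_2d(residual(original, predicted))
```

Because the residual in this example is constant, all of its energy lands in the single DC coefficient `coeffs[0][0]`, and the remaining coefficients are zero up to floating-point error. This energy compaction is what makes the subsequent quantization and entropy coding effective.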
Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which then may be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned in order to produce a one-dimensional vector of transform coefficients, and entropy encoding may be applied to achieve even more compression.
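The quantization and scanning steps described above can be sketched as follows. The text does not specify a quantizer or a scan order, so the uniform scalar quantizer and the classic zigzag order used here are assumptions for illustration, as are the function names `quantize` and `zigzag_scan`; actual codecs define their own quantization rules and scan patterns.

```python
def quantize(coeffs, step):
    """Uniform scalar quantization: divide by the step size and round."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def zigzag_scan(block):
    """Flatten a square 2-D array into a 1-D list along zigzag anti-diagonals."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):  # s = row + column indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals are walked bottom-left to top-right
        out.extend(block[r][s - r] for r in rows)
    return out


# Hypothetical 3x3 array labeled in zigzag order, so the scan returns 1..9.
labeled = [[1, 2, 6],
           [3, 5, 7],
           [4, 8, 9]]
scanned = zigzag_scan(labeled)
quantized = quantize([[8.0, 0.4], [-3.2, 1.6]], 2)
```

The scan places low-frequency coefficients first, so the trailing part of the one-dimensional vector tends to consist of zeros after quantization, which an entropy coder can represent compactly.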
Video encoders may utilize certain units (e.g., mode decision and/or motion estimation units) to encode video information. To speed up processing while minimizing performance degradation, many encoders use pre-processing engines to lower the complexity of these units. One such pre-processing engine is “background detection,” which may be used to distinguish static (or “not changing”) content in a video frame (e.g., background content) from changing (or “moving”) content (e.g., foreground content). Once the background content has been determined, the encoder may apply lower complexity processes (e.g., lower complexity mode decision, simpler motion vector determination, simpler interrupt addition checks, lower complexity motion estimation processes, etc.) when encoding that content, because encoding background content may simply involve copying content from the appropriate blocks of previous frames to the current frame. In some cases, the results of a motion vector search may be the only requirement for the encoding of a background block.
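For illustration only, one simple way to separate static blocks from changing blocks is to compare each block with the co-located block of the previous frame. The sketch below uses a sum of absolute differences (SAD) against a threshold; it is a generic frame-differencing approach and is not the technique disclosed herein. The function names, the block size, and the threshold value are all hypothetical.

```python
def block_sad(cur, prev, top, left, size):
    """Sum of absolute differences between co-located blocks of two frames."""
    return sum(abs(cur[r][c] - prev[r][c])
               for r in range(top, top + size)
               for c in range(left, left + size))


def detect_background(cur, prev, block_size, threshold):
    """Return a 2-D mask with True for each block treated as static background."""
    height, width = len(cur), len(cur[0])
    mask = []
    for top in range(0, height - block_size + 1, block_size):
        mask_row = []
        for left in range(0, width - block_size + 1, block_size):
            sad = block_sad(cur, prev, top, left, block_size)
            mask_row.append(sad <= threshold)
        mask.append(mask_row)
    return mask


# Hypothetical 4x4 luma frames: one sample changes in the top-right 2x2 block.
prev_frame = [[0] * 4 for _ in range(4)]
cur_frame = [row[:] for row in prev_frame]
cur_frame[0][3] = 50
background = detect_background(cur_frame, prev_frame, block_size=2, threshold=4)
```

Blocks flagged `True` in the mask would be candidates for the lower complexity encoding paths described above, while blocks flagged `False` would go through the full mode decision and motion estimation.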
Video encoding units (e.g., mode decision and motion estimation units) have become more complex and computationally intensive in modern video encoders (e.g., the HEVC encoder). The time and computational resources required for these units to perform certain functions (e.g., detecting background areas) have increased. One reason for this is that older video standards, such as AVC, only utilized transform sizes up to 8×8. However, the more modern HEVC standard utilizes up to 16×16 and 32×32 forward transform and inverse transform sizes. The larger transforms require more complexity and cycles when blocks are analyzed to detect whether they include only background content. In the interest of coding efficiency, the current standards would benefit from a process that reduces the complexity of background detection methods. Some advantages of the techniques disclosed herein relate to improving coding efficiency and reducing computational resource requirements during video encoding by reducing the complexity of background detection methods, which may then allow for less complex mode decision and motion estimation.
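To give a rough sense of how transform cost grows with block size, the sketch below counts multiplications for a separable N×N transform computed directly as two N×N matrix products. This direct form is an assumption made for simplicity; practical encoders use fast factorizations that need fewer operations, so the counts indicate the trend rather than actual cycle costs. The function name is hypothetical.

```python
def direct_transform_multiplies(n):
    """Multiplications for an n x n separable transform done as two n x n
    matrix products (n**3 multiplications each); an upper-bound style count."""
    return 2 * n ** 3


# Compare the sizes mentioned above: 8x8 (AVC maximum) against 16x16 and 32x32.
counts = {n: direct_transform_multiplies(n) for n in (8, 16, 32)}
per_sample = {n: counts[n] // (n * n) for n in counts}  # multiplications per sample
```

Under this assumption, a 32×32 block needs 64 times as many multiplications as an 8×8 block, and four times as many per sample, which is consistent with the observation that analyzing larger transform blocks costs more complexity and cycles.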