1. Field of the Invention
The present invention relates generally to digital video and, more particularly, to digital video reconstruction.
2. Discussion of Prior Art
Video processing has evolved with the economic phases of video formats. In early analog video, filtering and delay line manipulation of continuous signals, sc(t), typically provided only one-dimensional ("1-D") processing of a small neighborhood of data along a single scan line. Compression (primarily signal limiting and added component interleaving) included band-limiting, interlace scanning, RGB-to-YUV color-space conversion, subcarrier insertion and vestigial side-band modulation. Enhancement processing stages, such as comb filtering and YUV-to-RGB color-space restoration, were also added for correction of compression effects.
Two-dimensional ("2-D") digital video enabled more precise multiple scan-line processing of discrete signals, s(t). Among other advantages, single-image compression techniques, such as the Joint Photographic Experts Group standard ("JPEG"), could now be used to provide digital video image reproduction without perceivable artifacts, or "transparent" digital coding. Newer enhancement processing stages, such as time base correction ("TBC"), 2-D comb filtering, edge enhancement and noise reduction, were also enabled.
Recently, 3-dimensional or "3-D" video processing (i.e. of horizontal, vertical and temporal image aspects) has emerged, most notably in the Moving Picture Experts Group or "MPEG" standards. MPEG-1, for example, introduced block-based motion compensated prediction ("MCPn"), which describes the interframe movement of blocks displaced from arbitrary locations. Using MCPn, rudimentary groups of pictures or "GOPs" are formed in which a higher-bitrate "intra-coded" or "I" frame/macroblock can be followed by lower-bitrate differentially-coded predictive and bi-directional or "P" and "B" frames/macroblocks (e.g. IBBPBB). Advantageously, differential coding typically provides a three-fold compression improvement over still-frame digital image coding. Further, such 3-D coding techniques as synthetic coding (e.g. MPEG-4) are expected to provide even greater compression through more advanced motion models than those used in current block-based coding.
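The block-based prediction described above can be illustrated with a minimal sketch. It assumes integer-pixel motion vectors and frames held as nested lists; the function names and sample values are hypothetical and are not taken from any MPEG reference implementation. A predicted block is copied from a displaced location in the reference frame, and the differentially coded residual is added to it.

```python
def predict_block(reference, top, left, mv, size):
    """Copy a size x size block from `reference`, displaced by mv = (dy, dx)."""
    dy, dx = mv
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def reconstruct_block(reference, top, left, mv, residual):
    """Add the decoded prediction error (residual) to the displaced prediction."""
    size = len(residual)
    pred = predict_block(reference, top, left, mv, size)
    return [
        [pred[r][c] + residual[r][c] for c in range(size)]
        for r in range(size)
    ]


# Hypothetical 4x4 reference frame and a 2x2 block at (0, 0) with vector (1, 2).
reference = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
residual = [[1, 0], [0, -1]]
block = reconstruct_block(reference, top=0, left=0, mv=(1, 2), residual=residual)
```

Only the vector and the (typically small) residual need to be coded for such a block, which is the source of the bitrate saving attributed to P and B pictures.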
Despite such advances, however, traditional processing approaches continue to be utilized. For example, while coding, decoding and enhancement processing are typically included within matched encoder-decoder pairs or "codecs," such processing continues to be conducted as separate and distinct processing stages. One likely reason is that the predominant video codec standards, MPEG and its progeny, define the generic standard-compliant decoder as one that uses prescribed rules and algorithms or "semantics" that react to coded bitstream elements to provide a one-to-one mapping from the input bitstream into an expected output sequence of samples; using such standards, the resulting uncompressed video signals resemble analog signals closely enough that traditional post-processing enhancement methods can be readily applied. Another possible reason, among others, is that the conversion of the intermediate decoder output stream into a display format is usually defined by a separate application specification, such as ATSC, DVD or DVB and their progeny.
As shown in FIG. 1, for example, a conventional MPEG encoder 101 typically comprises separate processing stages for pre-processing 111, coding 113 and (optionally) multiplexing 115 a received video source; complementarily, an MPEG decoder 103 includes stages for de-multiplexing 131 received standard-coded data, decoding 133 the de-multiplexed data, and then post-processing 135 the resulting decoded data samples. Pre-processing stages typically provide for artifact reduction (e.g. noise filtering, time-base correction, etc.) and codec accommodation (e.g. anti-alias low-pass filtering, entropy minimization filtering, downsampling, etc.). Post-processing stages conventionally provide for codec accommodation (e.g. de-interlacing) and display format conversion, but can also enable image improvement.
Unfortunately, such traditional approaches are capable of only limited image improvement. To make matters worse, conventional approaches require substantial estimation, iteration and computation, a burden that is exacerbated by the real-time operation required for continuous video display. De-interlacing, for example, aims to convert an interlaced signal for progressive display. However, while an interlaced signal might contain some progressive content (e.g. 3:2 pulldown film) or interlace coding (e.g. MPEG interlace DCT and field prediction tools), the challenge of de-interlacing remains that of using decoded samples to estimate what the decoded image content would have been if it had been progressively scanned. To make matters even more difficult, the most effective estimation technique potentially usable by conventional codecs, namely tracking decoded objects across several frame periods and then filtering along those points, is very computationally expensive.
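To make the estimation problem concrete, the following is a minimal sketch of one of the simplest de-interlacing estimators, intra-field line averaging, operating on decoded samples only. It is an illustrative assumption, not a technique described by this document: the function name and sample data are hypothetical, and the motion-tracking approach mentioned above is considerably more involved.

```python
def deinterlace_field(field_lines):
    """Estimate a progressive frame from one field by averaging adjacent lines.

    `field_lines` holds every other scan line of the frame (one field).
    Each missing line is estimated as the mean of the field lines above and
    below it; the last field line is repeated at the bottom edge.
    """
    frame = []
    for i, line in enumerate(field_lines):
        frame.append(list(line))
        if i + 1 < len(field_lines):
            below = field_lines[i + 1]
            frame.append([(a + b) / 2 for a, b in zip(line, below)])
        else:
            frame.append(list(line))
    return frame


# Hypothetical three-line field, two samples per line.
top_field = [[0, 0], [10, 20], [20, 40]]
frame = deinterlace_field(top_field)
```

The interpolated lines are only guesses at what a progressive scan would have captured, and vertical detail finer than the field line spacing cannot be recovered this way.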
Other feature enhancements are similarly limited by traditional processing approaches. For example, conventional frame rate conversion uses repeated frames to increase display rate, and frame interpolation to improve object motion smoothness; however, conventional frame interpolation suffers from the same object tracking requirements as de-interlacing. Motion blur reduction can also be used to recover some detail lost to object motion during camera integration; however, detail needed by the "inverse blurring algorithm" is likely lost through compression and decoding, and only minor improvement can be achieved by fusing information across frame periods of decoded sample data. Feature enhancement can further be used to emphasize detail that is otherwise below the human visible threshold. However, sub-threshold emphasis is often at odds with the conventional encoder practice of filtering out imperceptible image attributes, and conventional high-pass filtering of decoded data samples is capable of providing only limited feature enhancement and can actually increase the visibility of compression artifacts.
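The two frame rate conversion strategies mentioned above can be contrasted in a short sketch. It assumes each frame is a flat list of samples and a non-empty sequence; the function names are hypothetical. The interpolation shown is a plain temporal blend with no object tracking, which is the simplest possible case and not the motion-tracked interpolation the text refers to.

```python
def repeat_frames(frames, factor):
    """Raise the display rate by showing each frame `factor` times."""
    return [list(f) for f in frames for _ in range(factor)]


def blend_midframes(frames):
    """Insert one averaged frame between each pair of neighbours.

    Assumes `frames` is non-empty. No motion is tracked, so moving objects
    appear as double images rather than at an intermediate position.
    """
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(list(a))
        out.append([(x + y) / 2 for x, y in zip(a, b)])
    out.append(list(frames[-1]))
    return out


# Hypothetical two-frame sequence with two samples per frame.
frames = [[0, 0], [10, 20]]
repeated = repeat_frames(frames, 2)
blended = blend_midframes(frames)
```

Repetition leaves motion as jerky as the source, while an untracked blend smooths intensity changes but not object positions, which is why tracking becomes necessary for smooth motion.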
In emerging "superresolution" techniques, an attempt is made to provide for image restoration and enhancement using "enhancement-facilitating" information found to exist within decoded data samples. For example, bitstream vectors are used in an attempt to link areas in the original reference picture (i.e. prior to quantization) which most closely resemble the current picture. Typically, each vector is refined to half-pixel accuracy by comparing the original current macroblock against the decoded reference picture. Each final selected vector then forms a prediction address from the decoded reference picture. Further accuracy for the current macroblock is also attempted by adding the DCT-coded prediction error to the prediction formed in an earlier motion compensated prediction or "MCP" stage. One model of video restoration theory, for example, describes the observed signal, g, as the original signal, s, convolved with the point spread function distortion ("PSF-distortion") D plus the noise, v, as given by the following equation 1:
g = Ds + v        [Equation 1]
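Equation 1 can be illustrated with a minimal one-dimensional sketch, assuming D acts as a convolution with a short point spread function and v is additive Gaussian noise. The function names, the edge-replication boundary handling and the sample PSF are assumptions made for illustration only.

```python
import random


def apply_psf(signal, psf):
    """Compute Ds: convolve `signal` with a centered, odd-length PSF.

    Samples beyond either end are taken from the nearest edge sample.
    """
    half = len(psf) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(psf):
            j = min(max(i + half - k, 0), n - 1)  # convolution index, clamped
            acc += weight * signal[j]
        out.append(acc)
    return out


def observe(signal, psf, noise_sigma, rng):
    """Compute g = Ds + v with v drawn from a zero-mean Gaussian."""
    blurred = apply_psf(signal, psf)
    return [b + rng.gauss(0.0, noise_sigma) for b in blurred]


# Hypothetical impulse signal and a small smoothing PSF.
s = [0.0, 0.0, 1.0, 0.0, 0.0]
psf = [0.25, 0.5, 0.25]
g_noiseless = observe(s, psf, 0.0, random.Random(0))
g_noisy = observe(s, psf, 0.05, random.Random(0))
```

Restoration amounts to estimating s from g, which is difficult because D discards detail and v masks what remains.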
Unfortunately, the existence of enhancement information in decoded image samples is only a fortuitous byproduct of pre-encoder processing, preprocessing, coding and decoding, and conventional superresolution has not yet been proven viable using real-world encoded (i.e. encoded and then decoded) video. Thus, while some enhancement capability has been demonstrated in controlled contexts, conventional superresolution, as with other conventional techniques, is found to be computationally expensive and unreliable. Worse yet, the inconsistent intra-frame and temporal enhancement produced by such methods is often obvious and distracting to a viewer, such that the results produced might be even more detrimental than those obtained without such enhancement.
Accordingly, there is a need for apparatus and methods capable of more effectively performing video decoding and enhancement.
Broadly stated, the present invention provides for advanced processing of a standard-coded digital video signal using information other than standard-decoded data samples. Preferably, such processing is conducted by an advanced decoder comprising coding, decoding and enhancement tools capable of utilizing bitstream data to perform both decoding and image enhancement. Advanced processing further preferably includes such image enhancement capability as resolution enhancement, improved motion portrayal and artifact suppression, but can more generally include these and/or a wide variety of other enhancements as might be desirable in accordance with a particular application.
More specifically, the invention breaks with traditional processing segmentation and instead provides, whether actually implemented in a more integrated or separated configuration, for more integrated codec stage operation and data utilization. In one aspect, the invention provides for the use of both coding and decoding type tools in performing standard-compliant decoding and image enhancement operations. In another aspect, the invention enables the use of bitstream state elements for facilitating enhancement processing. In a further aspect, the invention enables enhancement not inconsistent with conventional superresolution that is also capable of utilizing compressed image representations. The invention further provides for advanced decoder implementations capable of providing the aforementioned as well as yet other decoding and image enhancement capabilities.
In accordance with the present invention, enhancement processing is preferably capable of utilizing techniques consistent with those teachings broadly referred to by the above-referenced co-pending patent applications as "superresolution" and "reverse superresolution" (or "SR" and "RSR" respectively). It will become apparent, however, that RSR does not describe merely the reverse of SR, even as SR is extended beyond conventional meaning by such applications to incorporate their teachings.
However, in order to further the useful broad classifications established by such applications, SR is used herein in the context of enhancement processing to refer to all quality/functionality improving reconstruction (i.e. except standard decoding); in contrast, RSR will refer broadly to all advanced coding-type techniques. Additionally, the labels "conventional-SR" and "advanced-SR" will be used where operability-inhibiting limitations of conventional-SR might not be readily apparent. It should further be noted that the term "standard," as used herein, refers not only to formally standardized protocols, techniques, etc., but also to other methods and apparatus to which RSR, advanced-SR and/or other teachings of the present invention are capable of being applied.
Accordingly, in a preferred embodiment, the invention comprises an integrated advanced decoder capable of performing advanced standard-compliant decoding and enhancement processing. The advanced decoder preferably receives and parses standard-coded bitstreams, providing state and other bitstream elements and metrics for use in an integrated manner in performing enhancement processing. Such processing is further capable of utilizing techniques and/or apparatus (or "tools") that might be more traditionally considered "decoding-related" and/or "encoding-related," and more preferably, advanced-SR and RSR respectively (e.g. using diffused data and/or meta data as taught by the above-mentioned prior applications).
Advantageously, the present invention is capable of providing image enhancement in a robust and accurate manner. Since additional information utilized in accordance with the present invention provides insight as to the nature of the source image data and subsequent encoding, substantially less initial "guesswork" is required. In the case of bitstream data utilization, knowledge of how source information was encoded provides clues as to the source image itself, as well as how to better conduct enhancement processing. In the case of received diffused data, meta data and the like, actual multi-dimensional image elements and processing information can be used in conducting enhancement processing. In both cases, advanced decoding tools can also be utilized for advanced enhancement processing and for more advanced decoding. Also, in both cases, such advanced processing can be conducted with little or no impact on bitrate, such that data transfer and storage can remain essentially unaffected.
Further advantages also arise from the use of encoding-type processing during decoding. While handling each coding and decoding stage as a separate and distinct process has become well-entrenched in image processing, and particularly in video, the present invention instead seeks to find the most effective overall processing path in order to achieve desired results. Stated alternatively, image processing is viewed as occurring within a super-domain in which any number of representational and processing capabilities might be more effectively conducted in accordance with the overall results to be achieved (e.g. using an appropriate knowledge base, tools and/or data). Among other benefits, many of which are noted in the above-referenced prior applications, the use of additional information for decoding and enhancement is greatly facilitated.
Yet other advantages include that the invention is susceptible to numerous variations in accordance with various applications and emerging standards. For example, the decoding and image enhancement capabilities of the invention can be configured in accordance with a variety of integrated, separately implemented and combined implementations (e.g. integrated decoding and enhancement; decoding and post-processing, etc.). Further, new standards can readily utilize decoding and/or enhancement techniques and tools as presented herein, and it is expected that newer standards might incorporate certain aspects of the invention, while still further advanced tools are developed and utilized in accordance with the teachings herein.