1. Field of the Invention
The present invention relates generally to digital video and, more particularly, to digital video coding and reconstruction.
2. Discussion of Prior Art
The recent introduction of digital video technology holds great promise for the future of multimedia. Unlike its analog predecessors, digital video is capable of being stored, transferred, manipulated, displayed and otherwise processed with greater precision by a wide variety of digital devices. Digital processing can also be more readily conducted in conjunction with various other digital media (e.g. graphics, audio, animation, virtual-reality, text, mixed media, etc.), and with more reliable synchronization and lower generational degradation.
Successful deployment of digital video is largely due to the wide adoption of digital video standards, such as those espoused by the Moving Picture Experts Group ("MPEG specifications"). While often hindered by proliferated compatibility with analog conventions (e.g. interlace video) and other factors, standardized digital constructs nevertheless provide substantial compression via common video signals and produce conventionally "acceptable" perceived image quality.
FIG. 1, for example, illustrates a typical standard-compliant, one-to-many encoder and deterministic-decoder pair or "codec." As shown, codec 100 includes encoder 101 and decoder 103, which are connected via communications subsystem 102. Operationally, pre-processor 111 typically receives, downscales and noise filters video source s to remove video signal components that might otherwise impede encoding. Next, encode-subsystem 113 compresses and codes pre-processed signal s′, producing encoded-signal b. Multiplexer 115 then modulates and multiplexes encoded-signal b and transfers resultant signal b′ to communications subsystem 102. Communications subsystem 102 (typically not part of codec 100) can be a mere data transfer medium or can also include system interfaces and/or subsystems for combining, scheduling and/or delivering multiple singular and/or mixed media signals to a receiver. Decoder 103 typically operates at a receiver to reconstruct video source s. More specifically, signal b′ is demodulated and demultiplexed by demultiplexer 131, decoded by decode-subsystem 133 and then post-processed (e.g. filtered, converted, etc.) by post-processor 135. Following decoding, decoded-signal r′, which resembles the source signal s, is displayed and/or stored.
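The signal flow of FIG. 1 can be illustrated with a minimal Python sketch. It is a toy model and not an MPEG implementation: the function names, the one-dimensional sample list, the factor-of-two downscaling and the quantizer step of 8 are all assumptions chosen for illustration, and noise filtering and modulation are omitted.

```python
# Toy model of the codec 100 signal flow. Every function here is a
# hypothetical stand-in; none corresponds to a real MPEG API.

def pre_process(s, factor=2):
    """Stand-in for pre-processor 111: downscale by keeping every factor-th sample."""
    return s[::factor]

def encode(s_prime, step=8):
    """Stand-in for encode-subsystem 113: uniform quantization with rounding."""
    return [(x + step // 2) // step for x in s_prime]

def multiplex(b):
    """Stand-in for multiplexer 115: pack quantized values (0..255) into bytes."""
    return bytes(b)

def demultiplex(b_prime):
    """Stand-in for demultiplexer 131: unpack bytes into quantized values."""
    return list(b_prime)

def decode(b, step=8):
    """Stand-in for decode-subsystem 133: inverse quantization."""
    return [q * step for q in b]

def post_process(r, factor=2):
    """Stand-in for post-processor 135: upscale by sample repetition."""
    return [x for x in r for _ in range(factor)]

s = [17, 18, 43, 44, 80, 82, 121, 121]                 # hypothetical source samples
b_prime = multiplex(encode(pre_process(s)))             # encoder 101 side
r_prime = post_process(decode(demultiplex(b_prime)))    # decoder 103 side

# Largest difference between the source and the reconstruction.
max_error = max(abs(src - rec) for src, rec in zip(s, r_prime))
```

In this sketch the reconstruction has the same length as the source but differs from it by up to four levels, which reflects the point that the decoded signal only resembles the source.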
FIGS. 2 and 3 respectively illustrate encode-subsystem 113 and decode-subsystem 133 of FIG. 1 in greater detail. Beginning with FIG. 2, the downscaled video signal s′ from pre-processor 111 (FIG. 1) is received, optionally formatted, and then stored in frame store 203 by capture unit 201. Captured signals c′ are represented as a sequence of two-dimensional sample lattices corresponding to video frames. (The number of captured frames contemporaneously stored by frame store 203 is determined by encode-subsystem latency and the analysis window size utilized by analysis unit 202.) Stored frames are transferred to analysis unit 202 and otherwise retrieved multiple times as needed for actual encoding. Analysis unit 202, for example, measures standard-specific properties of each stored frame, which it transfers as metrics to decision unit 204.
Next, the analysis unit metrics are inserted into an encoding formula, producing the coding modes according to which encode unit 205 represents pre-processed frames as standard-compliant encoded-frames. More specifically, temporal prediction unit 207 retrieves frames from frame store 208, uses captured-frames to form a coarse current-frame prediction and then refines this prediction according to prior-encoded frames. Decision unit 204 then uses the refined predictions and metrics to control current frame coding. Finally, encode unit 205 uses a current coding mode to form, on a frame-area ("macroblock") basis, a coded frame.
Continuing with FIG. 3, a typical decode-subsystem 133 performs a simpler, deterministic operation compared with encode-subsystem 113, using the frame-data of each encoded frame to determine the proper reconstruction of a corresponding decoded frame. (For clarity, elements complementary to those of the encode-subsystem of FIG. 2 are correspondingly numbered.) Operationally, parsing engine 301 de-multiplexes the received variable length encoded-bitstream b. Thereafter, decode unit 305 provides spatial frame elements and temporal prediction unit 307 provides temporal frame elements, which reconstruction unit 306 reconstructs into decoded frames. Frame store 303 provides for frame reordering of differentially-coded adjacent frames (discussed below) and can also serve as a frame-buffer for post-processor 135 (FIG. 1).
In addition to current-frame prediction (above), standard-compliant codecs also provide for compression through differential frame representation and prediction error data. MPEG-2 coded video, for example, utilizes intra ("I"), predictive ("P") and bi-directional ("B") frames that are organized as groups-of-pictures ("GOPs"), and which GOPs are organized as "sequences." Typically, each GOP begins with an I-frame and then two B-frames are inserted between the I-frame and subsequent P-frames, resulting in a temporal frame sequence of the form IBBPBBPBB and so on. I-frames represent a complete image, while P and B frames can be coded respectively as differences between preceding and bi-directionally adjacent frames (or on a macroblock basis). More specifically, P and B frames include motion vectors describing interframe macroblock movement. They also include prediction data, which describes remaining (poorly motion-estimated or background) macroblock spatial-pattern differences, and prediction error data, which attempts to fill in for or "spackle" data lost to prediction inaccuracies. Prediction and prediction error data are also further compressed using a discrete cosine transform ("DCT"), quantization and other now well-known techniques.
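The GOP frame pattern described above, and the frame reordering it implies, can be sketched as follows. This is an illustrative Python sketch only: the function names and the `anchor_spacing` parameter are hypothetical, and the handling of trailing B-frames is a simplification (in an actual stream they would typically wait for the first anchor frame of the next GOP).

```python
def gop_display_order(num_frames, anchor_spacing=3):
    """Frame types in display order: an I-frame first, then a P-frame at
    every anchor_spacing-th position, with B-frames in between."""
    types = []
    for i in range(num_frames):
        if i == 0:
            types.append("I")
        elif i % anchor_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return types

def coding_order(frame_types):
    """Reorder frames so each B-frame follows the later anchor (I or P)
    that it is predicted from. Labels combine type and display index."""
    order, pending_b = [], []
    for index, frame_type in enumerate(frame_types):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)      # hold until the next anchor is emitted
        else:
            order.append(label)          # emit the anchor first
            order.extend(pending_b)      # then the B-frames that depend on it
            pending_b = []
    order.extend(pending_b)              # simplification: trailing B-frames go last
    return order

display = gop_display_order(9)
coded = coding_order(display)
```

For nine frames, `display` spells out IBBPBBPBB, while `coded` places each P-frame ahead of the two B-frames that precede it in display order. This is the reordering that a decoder-side frame store has to undo.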
Among other features, MPEG and other standards were intended to meet emerging coding needs. For example, they specify protocols rather than device configurations to enable emerging, more efficient protocol-compliant devices to be more readily utilized. (One purpose of GOPs, for example, is to avoid proliferation of drift due to differing decoder implementations by assuring periodic I-frame "refreshes.") MPEG-2 further provides profiles and levels, which support emerging higher resolution video (e.g. HDVD, HDTV, etc.). Scalability modes are also provided. Much like adding missing prediction error data to prediction data, MPEG-2 scalability modes allow "enhancement" frame data to be extracted from "base" frame data during encoding (typically using a further encode-subsystem) and then optionally re-combined from the resulting base and enhancement "layers" during decoding.
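The base and enhancement layering idea can be illustrated with a minimal Python sketch. It is loosely modeled on quality-style layering and is not the MPEG-2 scalability syntax: the function names, the quantizer step of 16 and the lossless enhancement residual are assumptions made for illustration.

```python
def split_layers(samples, base_step=16):
    """Encoder side: a coarsely quantized base layer plus an enhancement
    layer holding what the base layer lost (hypothetical helper)."""
    base = [(x + base_step // 2) // base_step for x in samples]
    enhancement = [x - q * base_step for x, q in zip(samples, base)]
    return base, enhancement

def reconstruct(base, enhancement=None, base_step=16):
    """Decoder side: use the base layer alone, or re-combine both layers."""
    coarse = [q * base_step for q in base]
    if enhancement is None:
        return coarse
    return [c + e for c, e in zip(coarse, enhancement)]

samples = [17, 43, 80, 121]                      # hypothetical frame data
base, enhancement = split_layers(samples)
base_only = reconstruct(base)                    # base-layer decoder
full = reconstruct(base, enhancement)            # decoder using both layers
```

A decoder that ignores the enhancement layer still obtains a usable, coarser result, while a decoder that re-combines both layers recovers the original samples in this toy case. The enhancement layer is additional data on top of the base layer, which is the bitrate cost at issue in layered coding.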
Unfortunately, standards are ultimately created in hindsight by committee members who cannot possibly foresee all contingencies. Worse yet, new standards materialize slowly due to the above factors and a need to remain compatible with legacy devices operating in accordance with the existing standard.
For example, while current standard-compliant codecs produce generally acceptable quality when used with conventional standard-definition television ("SDTV"), resultant signal degradation is perceivable and will become even more so as newer, higher-definition devices emerge. Block-based coding, for example, is non-ideal for depicting many image types, particularly images that contain objects exhibiting high velocity motion, rotation and/or deformation. In addition, standard compression is prone to over-quantization of image data in meeting bitrate and other requirements. Further, even assuming that an ideal low-complexity image well suited to block-based coding is supplied, image quality is nevertheless conventionally limited to that of the pre-processed signal. Defects in the source video itself, such as blur and noise, are also not even considered.
Another example is that conventional "data adding/layering" (e.g. prediction error, scalability, etc.) hinders coding efficiency. Such often data-intensive additions might well result in excessive bit-rate, which excess must then be contained through quality-degrading methods such as quantization. Thus, conventional scalable coding is rarely utilized, and it is unlikely that high-definition media (e.g. HDTV), while ostensibly supported, can be provided at its full quality potential within available bandwidth. Other applications, such as video conferencing, are also adversely affected by these and other standard coding deficiencies.
A new approach that promises to deliver better quality from standard-coded video is "superresolution." Conventionally, superresolution ("SR") refers to a collection of decoder-based methods that, during post-processing, reuse existing standard-decoded image data in an attempt to remove blur, aliasing, noise and other effects from an image. The term SR, while previously applied to producing a single higher-resolution image, now also encompasses using a series of decoded video frames for video enhancement as well.
In summary, conventional SR methods: identify common image portions within a predetermined number of decoded image frames; create a model relating the decoded images to an unknown idealized image; and set estimated criteria that, when met, will indicate an acceptable idealized image approximation. A resultant SR-enhanced image is then produced for each SR-image portion as a convergence of the model and criteria in accordance with the corresponding decoded-image portions. A review of known and postulated coding and SR methods is given, for example, in the Prentice Hall text Digital Video Processing by Murat Tekalp of the University of Rochester (1995).
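The three steps summarized above (common image portions, a model relating decoded images to an idealized image, and a convergence criterion) can be illustrated with a deliberately simplified Python sketch. It assumes one-dimensional frames that are already registered at known one-sample shifts and an observation model of pure decimation with no blur, motion or noise; the function names, the `gain` and the `tolerance` are hypothetical, and practical SR methods are considerably more involved.

```python
def observe(hi_res, shift, factor=2):
    """Assumed model: a decoded frame is the idealized signal decimated
    by `factor`, starting at sample offset `shift`."""
    return hi_res[shift::factor]

def super_resolve(frames, length, factor=2, gain=0.5, tolerance=1e-3, max_iters=100):
    """Iteratively refine an estimate until simulating the model from it
    reproduces every decoded frame to within `tolerance`."""
    estimate = [0.0] * length
    for _ in range(max_iters):
        residuals, worst = [], 0.0
        for shift, frame in enumerate(frames):
            simulated = observe(estimate, shift, factor)
            residual = [obs - sim for obs, sim in zip(frame, simulated)]
            residuals.append(residual)
            worst = max(worst, max(abs(r) for r in residual))
        if worst < tolerance:            # criterion met: accept the estimate
            break
        for shift, residual in enumerate(residuals):
            for k, r in enumerate(residual):
                # Back-project part of the error onto the samples that
                # this frame observed.
                estimate[shift + k * factor] += gain * r
    return estimate

ideal = [17.0, 18.0, 43.0, 44.0, 80.0, 82.0, 121.0, 121.0]   # hypothetical ideal signal
frames = [observe(ideal, shift) for shift in range(2)]        # two "decoded" frames
sr_image = super_resolve(frames, len(ideal))
```

With ideal registration and a noise-free model, the estimate converges to the idealized signal. The estimation, interpolation and iteration costs of practical SR arise once these assumptions no longer hold.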
Unfortunately, while promising, conventional SR effectiveness is nevertheless limited. For example, conventional SR is reliant on a specific codec and decoded frame and macroblock quality produced by that codec. Not only is such image data merely the fortuitous byproduct of original image production and prior processing, but it is also subject to the codec-specific downsampling, image representation, bitrate-limiting, data layering and other deficiencies given above. Conventional SR also relies on estimation, interpolation and computationally intensive iteration, the inexactness of which is exacerbated by real-time operation required in order to continuously display the SR-enhanced video. As a result, inconsistent intra-frame and inter-frame improvement, as well as other codec and SR artifacts, might be even more apparent to a viewer than without conventional SR-enhanced decoding.
Accordingly, there is a need for apparatus and methods capable of providing high-quality imaging in conjunction with but resistant to the limitations of standard codecs.
Broadly stated, the invention provides low-bitrate modified coding of a video signal enabling improved-quality upon reconstruction (e.g. decoding). The invention also enables further improvement when used in conjunction with advanced reconstruction in accordance with the invention.
More specifically, in one aspect, the invention provides for defining and exploiting image-aspect and image-coding redundancies, thereby enabling utilization of such redundancies to convey more complete information. In another aspect a super-domain model facilitates advanced-coding in a superimposed manner with standard-coding, thereby avoiding conventional limitations and enabling optimally-coded image information to be made available for transmission, storage, reconstruction and other uses. Multi-dimensional image-portion aspect diffusion and registration capabilities, direct coding/decoding and other tools also enable coding improvements to be efficiently integrated in a static and/or dynamic manner with standard-coded data. Analysis, susceptibility determination, consistency and other quality-assurance tools further facilitate diffusion, registration and other optimizations. In another aspect, the invention provides an advanced encoder capable of dynamic low-bitrate, advanced-coding that, upon reconstruction, can produce standard/enhanced quality images and/or other features. In yet another aspect, the invention further provides an advanced decoder that is capable of producing higher-quality and otherwise improved reconstructions in response to receipt of modifiedly-coded data and/or other information, among still further aspects.
In accordance with the present invention, advanced coding preferably includes techniques consistent with those teachings broadly referred to by the above-referenced co-pending patent applications as "reverse superresolution." It will become apparent, however, that the term reverse superresolution or "RSR" does not describe merely the reverse of "superresolution" or "SR," even as the term superresolution is extended beyond its conventional meaning by such applications to incorporate their teachings. For example, one advantage of RSR is that RSR can provide bitrate-reduced standard or modified quality in conjunction with conventional standard-decoders (i.e. without SR-enhancement). However, in order to extend the useful broad classifications established by such applications, SR will be even further extended herein in the context of codecs to refer to all quality/functionality improving reconstruction (i.e. except standard decoding); in contrast, RSR will refer broadly to all advanced coding-related techniques consistent with the teachings herein. Additionally, the labels "conventional-SR" and "advanced-SR" will be used where operability-inhibiting limitations of conventional-SR might not be readily apparent. It should further be noted that the term "standard," as used herein, refers not only to formally standardized protocols, techniques, etc., but also to other methods and apparatus to which RSR, advanced-SR and/or other teachings of the present invention are capable of being applied.
Accordingly, in a preferred embodiment, an RSR-enhanced encoder receives source image-data as well as available image-data creation, prior processing and/or user information. The enhanced encoder further determines the susceptibility of the image-data to available quality improvement. Preferably concurrently with such susceptibility determination, the enhanced encoder also determines opportunity within standard-compliant video coding for incorporating implemented quality improvements. The encoder further preferably dimensionally composites or "diffuses" improvements into and otherwise optimizes the encoded data stream. Additionally, the encoder provides for further diffused and/or a minimized amount of added data and/or information in either a unitary (e.g. conventional encoder-decoder operational pairing) or distributed manner in accordance with applicable reconstruction and/or other system constraints. Such determining and coding tools are further preferably modifiably provided and can apply to reconstruction generally, standard-decoding, and conventional/advanced SR, among other possibilities.
The preferred RSR-enhanced encoder is further preferably operable in accordance with advanced-reconstruction. More preferably, an advanced SR-decoder is provided which is capable of conducting advanced local and/or distributed reconstruction in accordance with diffused and/or added information, cooperatively with advanced coding and/or in accordance with standard-decoding.
Advantageously, the invention is capable of providing determinable-quality, lower bitrate and/or otherwise improved operation in a standard-compliant, yet efficiently adaptable and scalable manner. For example, between standard introductions, otherwise non-compliant improvements can be readily incorporated into systems utilizing standard-compliant codecs; assuming such improvements are adopted by a revised or new standard, yet further improvements can be readily incorporated in accordance with the new standard, and so on.
In addition, more effective and precise functionality can be achieved using matched and/or unmatched encoders and decoders. For example, the invention enables more effective results not only from standard-compliant, non-scalable and scalable decoding, but also from conventional SR-enhanced decoders and advanced-SR reconstruction.
The invention further enables quality-improvement to be achieved using standard quality as a modifiable concurrently-deliverable baseline. For example, standard or improved quality/functionality can be provided at significantly reduced bitrate. In addition, the same (or further modified) coded image data, without added bandwidth, can produce standard quality/functionality with standard-compliant systems and improved quality/functionality with other systems. Still further, standard and/or improved quality/functionality can be dynamically provided in accordance with static or dynamically varying quality, bandwidth and/or other operational constraints, among other examples.
Another advantage is that the invention is capable of providing such improvement in a manner that is adaptable to disparate standards, tools, operational constraints and implementation configurations, only a few of which might be specifically noted herein. Thus, for example, investment in legacy and emerging technologies is preserved.
The invention also makes possible practical determinable-quality reconstruction, in part, by increasing efficiency, reducing localized and real-time processing workload, enabling decoder-based coding-type operations and/or by reducing bandwidth requirements, among yet other advantages.
These and other objects and advantages of the present invention will become apparent to those skilled in the art after considering the following detailed specification, together with the accompanying drawings.