1. Field of the Invention
The present invention relates generally to digital video processing and, more particularly, to digital video encoding.
2. Discussion of Prior Art
Significant advances continue to be made in video processing technology. Analog video (e.g. NTSC, PAL, etc.), which provided only limited compression through typically single scan-line or one-dimensional (“1-D”) processing, has been surpassed by more efficient multiple scan-line or two-dimensional (“2-D”) digital video. 2-D video has in turn been surpassed by horizontal, vertical and temporal or three-dimensional (“3-D”) digital video. Even MPEG-1, which was once the predominant mainstream 3-D video codec standard, has recently been surpassed by the more versatile and higher-bitrate-capable MPEG-2. MPEG-2 is now the predominant mainstream compression standard; however, work is already underway to develop still more efficient techniques for providing even higher compression ratios together with substantially perceivable-artifact-free or “transparent” video coding (e.g. MPEG-4).
Despite ongoing advancements, however, remnants of earlier video coding nevertheless remain. For example, the broad processing stages of the FIG. 1 encoder-decoder pair or “codec” are still typically utilized. As shown, encoder 101 includes a pre-processor 111 for refining the source video bitstream to facilitate coding, an encode-subsystem 113 for performing coding, and an optional multiplexer 115 for combining multiple data streams (not shown). Complementarily, a typically matched decoder 103 includes an optional de-multiplexer 131, a decode-subsystem 133 for reconstructing video frames, and a post-processor for removing coding artifacts and performing final processing (e.g. display format conversion).
Another remnant of earlier video coding is the continued use of interlacing. Initially, interlacing (i.e. using alternate scan line sequences in overlaying fields) and other data-layering techniques were introduced to supplement the limited compression capability of analog video and thereby reduce bandwidth requirements. Interlacing also enabled an image to be captured as half-resolution fields (using less-expensive half-scan capturing devices) that could later be combined for display, and was quickly integrated as a basic format within most video devices (e.g. commercial television cameras, televisions, etc.). While the advent of 3-D (i.e. spatio-temporal) video and full-resolution capturing has obviated a specific technical need, cost concerns have instead resulted in the proliferation of interlacing with newer and emerging video devices and application standards (e.g. consumer cameras, VCRs, DVD, HDVD, HDTV, etc.).
The use of an interlaced video format is, however, problematic. One reason is that MPEG-1 and other early compression standards enable only frame-based continuous-scan or “progressive” input, which is not directly compatible with interlaced sources. MPEG-1, for example, expects a progressive source-input-format (“SIF”) of 352×240 pixels per frame at 30 frames per second or 352×288 pixels per frame at 25 frames per second and does not recognize the subdivision of frames into fields. Contrastingly, interlaced sources supply two fields per frame with a field resolution of 720×240 pixels and a field rate of 60 fields per second (or equivalently, 720×480 pixels per frame at 30 frames per second). FIG. 2a further illustrates how a progressive frame 201 represents an instantaneous snapshot of a scene, while an interlaced frame 202 includes fields that are offset both vertically (i.e. by alternating scan lines) and temporally (e.g. by a 1/60 of a second delay between fields). Such offsetting is problematic with respect to not only conversion, but also other processing. (Note that fields 202a and 202b have been spaced apart for greater clarity; in actuality, the two field images will overlap in time.)
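By way of illustration only, the raster figures quoted above can be checked with a few lines of Python. This is a minimal sketch and not part of any standard or of the disclosed apparatus: the helper names `split_fields` and `pixels_per_second` are hypothetical, and a frame is assumed to be a simple list of scan lines in which even-indexed lines belong to one field and odd-indexed lines to the other.

```python
def split_fields(frame):
    """Split an interlaced frame into its two fields.

    Assumption: `frame` is a list of scan lines; even-indexed lines form
    one field and odd-indexed lines form the other.
    """
    return frame[0::2], frame[1::2]


def pixels_per_second(width, height, rate_hz):
    """Raw pixel throughput of a raster delivered `rate_hz` times per second."""
    return width * height * rate_hz


# Interlaced source described as fields and, equivalently, as frames.
field_throughput = pixels_per_second(720, 240, 60)
frame_throughput = pixels_per_second(720, 480, 30)

# Progressive SIF input expected by an MPEG-1 encoder (30 frames per second case).
sif_throughput = pixels_per_second(352, 240, 30)

# A 480-line frame separates into two 240-line fields.
top_field, bottom_field = split_fields(list(range(480)))
```

The field-based and frame-based descriptions of the interlaced source give the same throughput, while the SIF raster carries roughly a quarter as many pixels per second, which is why a down-conversion step is needed ahead of a progressive-only encoder.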
One conventional approach to resolving the incompatibility between an interlaced video source and progressive-only encoder input has been down-conversion de-interlacing. Two such techniques have traditionally been used. In the first technique, decimation, one field of each video frame is summarily dropped during pre-processing and the remaining field is transferred to the encode-subsystem. In the second technique, averaging, each interlaced video field pair is summarily combined during pre-processing, then the vertical frame resolution is filtered to avoid resultant aliasing, and then the pre-processed data is transferred to the encode-subsystem.
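The two conventional techniques just described can be sketched as follows. This is a simplified illustration under stated assumptions rather than a description of any particular prior-art implementation: a frame is assumed to be a list of scan lines of luma samples, the function names `decimate` and `average_fields` are hypothetical, and the vertical filtering of the averaging technique is approximated by a plain two-tap mean of each pair of adjacent lines, one from each field.

```python
from typing import List

# Assumed representation: a frame is a list of scan lines, each a list of samples.
Frame = List[List[float]]


def decimate(frame: Frame, keep_first_field: bool = True) -> Frame:
    """Decimation: drop one field and pass the remaining field on unchanged."""
    start = 0 if keep_first_field else 1
    return [line[:] for line in frame[start::2]]


def average_fields(frame: Frame) -> Frame:
    """Averaging: combine the field pair into one half-height frame.

    Assumption: each output line is the mean of one line from the first
    field and the adjacent line from the second field (a two-tap vertical
    filter followed by 2:1 vertical subsampling).
    """
    first, second = frame[0::2], frame[1::2]
    return [
        [(a + b) / 2.0 for a, b in zip(line_a, line_b)]
        for line_a, line_b in zip(first, second)
    ]


# Tiny 4-line, 2-sample-wide frame used only to exercise the two functions.
frame = [[0.0, 0.0], [10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
decimated = decimate(frame)
averaged = average_fields(frame)
```

Both functions halve the vertical resolution. Decimation discards the second field's samples entirely, whereas averaging mixes samples captured at two different instants into each output line.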
Unfortunately, both traditional down-conversion de-interlacing techniques, while known to produce generally low-quality results, are nevertheless in widespread use. Conventional decimation tends to produce a reduced quality step-wise appearance or “aliasing,” which is not only generally perceivable, but also becomes even more noticeable at lower display resolutions (e.g. using MPEG-1). While averaging avoids a vertical aliasing problem of decimation, it nevertheless tends to cause blurring and other temporal artifacts.
The second conventional approach to resolving interlace-to-progressive or otherwise de-interlaced input incompatibilities has been to simply replace progressive-only codec standards (e.g. MPEG-1 and its progeny) with those capable of receiving both progressive and interlaced input (e.g. MPEG-2 and its progeny). However, despite the above-noted broad acceptance of such standards, there remains a significant number of legacy devices still in use that incorporate progressive-only encoding. In addition, MPEG-1 and other low-bitrate codec standards are proving useful in traditionally nonmainstream imaging applications, such as video conferencing (e.g. H.26n), among others.
Also unfortunate is that the more well-known up-conversion de-interlacing techniques used to convert interlaced output data for higher resolution progressive display purposes are inapplicable to down-conversion de-interlacing. By way of comparison, up-conversion of an NTSC video signal (FIG. 2b) requires field-to-frame conversion from 240 lines at 1/60 of a second per field to 480 progressive frame lines at 1/60 of a second. In contrast, the current source video to encoder input incompatibility (FIG. 2c) requires frame-to-frame conversion from 480 lines per frame at 1/60 of a second to 240 progressive frame lines at 1/30 of a second.
Accordingly, there remains a need for apparatus and methods capable of down-converting interlaced video signals into a high-quality, lower resolution signal capable of being coded by a progressive-only video encoder.