1. Field of the Invention
The present invention generally relates to digital encoding of images and, more particularly, to encoding, with compression, of sequences of images to be reproduced in rapid succession to produce the illusion of motion, such as for digital transmission of motion pictures or animated graphics.
2. Description of the Prior Art
For purposes of communication, digital signalling is currently much preferred to analog signalling in most environments and applications. Consequently, communications infrastructure is rapidly being converted to carry digital signals. Reasons supporting such a strong preference are much increased bandwidth and transmission capacity, decreased susceptibility to noise and the possibility of strong error correction to compensate for transmission losses. Accordingly, it is now possible to transmit relatively massive amounts of data economically and in short periods of time.
One such application which is rapidly becoming familiar and a source of substantial economic interest is the digital transmission of pictorial images and graphics. In particular, the transmission of images at high data rates sufficient to achieve the illusion of motion such as is encountered in animated graphics and motion pictures is now commercially feasible and coming into relatively widespread use. However, to do so, a sequence of images must be presented at rates above the so-called flicker fusion frequency of human visual perception, generally accepted as being about twenty-four to thirty images per second.
Further, digital image data must contain a very large amount of information to achieve good image quality and fidelity. A single image may contain several million image points or “pixels”, each of which must be encoded to represent fine gradations of both color and intensity. Thus it can be seen that even a single, very short sequence of digitized motion picture could require the equivalent of billions of bytes of data to be transmitted and/or stored.
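The scale of the figures mentioned above can be checked with simple arithmetic. The following is a minimal sketch; the frame dimensions, bytes per pixel, frame rate and duration are illustrative assumptions and do not come from the text.

```python
def raw_video_bytes(width, height, bytes_per_pixel, frames_per_second, seconds):
    """Return the uncompressed size, in bytes, of a sequence of images."""
    return width * height * bytes_per_pixel * frames_per_second * seconds


# Hypothetical figures: a roughly two-million-pixel image, 3 bytes per pixel,
# 30 images per second, for one minute.
size = raw_video_bytes(1920, 1080, 3, 30, 60)
print(size)  # 11197440000, i.e. about 11 billion bytes
```

Under these assumed figures, one minute of uncompressed imagery already exceeds ten billion bytes.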
In order to accommodate such massive amounts of information with commercially available and sufficiently inexpensive hardware to be used by persons desiring such information or the general public at large and to efficiently and economically utilize the communication infrastructure, it is necessary to reduce the volume of data by compression. Several standards for image data compression have been proposed and widely adopted. Among the more well-accepted standards for compression of image data are the JPEG (Joint Photographic Experts Group) standard and the MPEG (Moving Picture Experts Group) standard, both of which are known in several versions at the present time.
The JPEG standard allows optimal resolution and fidelity to be maintained for any arbitrary degree of data compression, and compression by a factor of twenty or more often does not result in a generally perceptible loss of image quality or fidelity. The MPEG standard is similar to the JPEG standard in many respects but also allows redundancy of portions of the image from frame to frame to be exploited for additional data compression. This process is enhanced by applying different encoding and decoding techniques to independent frames (I-frames), which are compressed independently of data in other temporally proximate frames; predicted frames (P-frames), which are compressed in terms of changes from a preceding I or P frame; and bidirectionally interpolated frames (B-frames), which are interpolated between preceding and following I or P frames.
The high degree of compression with minimal loss of fidelity is enhanced in accordance with these and other standards by providing flexibility of coding in dependence upon image content. A powerful concept in this process is entropy coding, so called because, in a manner somewhat parallel to the concept of entropy in the more familiar thermodynamic context, it uses a measure of the disorder within the image as a metric for assignment of particular codes to particular image values. The assignment rests on the well-founded assumption that less common values contain greater amounts of information, justifying greater numbers of bits, and that more common image values contain relatively less information and can (and should) be represented by smaller numbers of bits in the coded data. However, to determine how image data values in a given image (or portion thereof, since coding tables can be changed within an image) are to be encoded, it is necessary to accumulate statistics concerning the image values in the image before code values can be analyzed and efficient codes assigned to respective values. In other words, a substantial portion of the encoding process must be completed and the results analyzed before it can be known which codes can be most efficiently assigned to image values representing regions within the image.
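The dependence of code assignment on accumulated statistics can be illustrated with a short sketch. Huffman coding is used here only as one familiar entropy coding technique; the text does not name a specific method, and the function names and sample values below are hypothetical.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(values):
    """Return {value: code length in bits} for a Huffman code built from counts."""
    counts = Counter(values)  # statistics must be gathered before codes exist
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (weight, unique tiebreak, {value: depth in code tree}).
    heap = [(n, i, {v: 0}) for i, (v, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every value in them one level deeper.
        merged = {v: d + 1 for v, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def entropy_bits(values):
    """Shannon entropy of the observed values, in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


# Hypothetical image values: 0 is common, 2 and 3 are rare.
sample = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(sample)
print(lengths)               # common values get short codes, rare values long ones
print(entropy_bits(sample))  # 1.75 bits per value for this sample
```

In this sample the most common value receives a 1-bit code and the two rarest values receive 3-bit codes, for 28 bits in total against 32 bits for a fixed 2-bit code. None of the lengths can be assigned until all of the values have been counted.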
(As a matter of terminology, it will be understood by those skilled in the art that “pixel values” such as luminance and chrominance of the individual pixels of the image are transformed in groups, called macroblocks, by an orthogonal transform process such as a discrete cosine transformation to yield values which represent the image in terms of spatial frequency and which are referred to herein as “image values”. This processing has the effect of providing image values which may have a reduced number of significant bits and which may often be reduced, or have zero bits removed, by truncation without perceptible reduction in image fidelity since human visual perception is relatively less sensitive to high spatial frequencies. At the same time, image values representing low spatial frequency, to which the human eye is also somewhat insensitive, may be more common but represented by fewer bits through entropy encoding. However, the particular preprocessing is not important beyond the fact that substantial preprocessing must be performed and the results analyzed before the details of a relatively optimal encoding process can be determined.)
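The transformation of pixel values into spatial-frequency image values can be sketched with a direct, unoptimized two-dimensional discrete cosine transform. The 8 x 8 block size, the orthonormal scaling and the function name are assumptions made for illustration; practical encoders use fast factorizations of the same transform.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Scale factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


# A perfectly flat block of pixel values (hypothetical data).
flat = [[100] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 800.0: all energy is in the zero-frequency term
```

For a flat block every coefficient other than the zero-frequency term is zero to within rounding error, which is the kind of concentration of information that subsequent truncation and entropy coding exploit.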
In the past, it has been the practice to perform encoding in a pipelined fashion with each discrete processing step being performed on the results of a preceding step. However, this approach may require a process to be performed for an entire frame before a following process can be started and thus introduces latency in the data which may cause synchronization problems. Encoders adequate for television data rates (which are of lower resolution than may be desired) and using pipelined architectures have been developed and are currently available but exhibit such latency and may cause such synchronization problems, particularly where the encoding requires extra bits to be used or quantization table(s) to be changed; both of which increase the number of bits which must be transmitted. However, conditions such as extra bits and frequent changes of quantization tables are more likely to occur when increased image quality, fidelity and/or resolution is required.
Preprocessing of the image values is thus often used to predict encoding options for optimized picture quality. Since encoder output provides the most accurate information concerning the image content, encoders can be used as preprocessors. Cascade encoding using a plurality of encoders in stages has been used to improve picture quality. The silicon/chip size, circuit power and evenness of picture quality depend on the amount of information and output statistics that are provided to the second encoding stage and, in such an environment, first stage encoder/preprocessor statistics must be extracted and collected from the first stage encoder and then converted to the host interface data format and fed to the second stage encoder. Such a system is often referred to as a two-pass system and supports use of image value statistics for choice of encoding options on the same frame (as distinct from a so-called one-pass system which uses statistics from one frame for coding of a following frame for which they may not be optimal or even appropriate and which thus cannot optimize encoding of any frame or field based on the actual content of that frame or field).
However, these pipelined and data transfer processes require extensive support in both hardware and processing, particularly for synchronization of data transfer and encoder functions and buffering, and, hence, increase circuit complexity and, generally, image data latency even though some hardware economies may be realized in regard to the encoders themselves since commercially available encoders may be used. This additional, multiple-function overhead to coordinate multiple encoder pipelines and data transfer functions is often comprehensively referred to as “external glue logic”, which may be quite extensive and may significantly increase data latency as well as overall encoder complexity and cost. There has been no alternative to pipelining and the preprocessing, latency and extensive external glue logic that pipelining implies when supporting optimal choice of encoding options based on image content in a two-pass encoder arrangement.
It is therefore an object of the present invention to provide a simple, compact and economical encoding and/or data compression system having low, programmable latency and the capacity to support prediction of coding options for optimized decoded data (or image) quality or fidelity with reduced synchronization processing overhead.
It is another object of the invention to provide an encoding system utilizing an encoder for data pre-processing and control of another encoder, particularly for image data with simplified control of data transfer without external glue logic.
It is a further object of the invention to effectively provide a two-pass system for optimal encoding and/or compression of each frame of image data in substantially real-time in a simplified manner with reduced processing and hardware support.
In order to accomplish these and other objects of the invention, an encoding system is provided including a first encoder functioning as a preprocessor for collection of statistics concerning input data, a second encoder for receiving the collected statistics concerning input data, selecting between encoding options responsive to the statistics and encoding data in accordance with the selected options, and an arrangement for autonomously transferring the statistics from the first encoder to the second encoder whereby encoding is optimized for current input data without external glue logic.
In accordance with another aspect of the invention, a data encoding/compression method is provided comprising steps of providing input data in parallel to a plurality of encoders, partially processing the input data to derive partially processed data, collecting statistics concerning the partially processed data in a first encoder, autonomously transferring the statistics to a second encoder, and further processing the partially processed data in accordance with the statistics.
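The data flow of the method steps recited above can be modeled in software as a rough sketch. Everything specific in it is a hypothetical stand-in: the statistics collected (variance and peak), the option selected (a quantizer step), the threshold, and the function names are illustrative assumptions. The autonomous hardware transfer between encoders is modeled simply as passing a value from one function to the next.

```python
from statistics import pvariance


def first_pass_statistics(frame):
    """Stand-in for the first encoder: collect statistics on one frame's values."""
    return {"variance": pvariance(frame), "peak": max(abs(v) for v in frame)}


def choose_quantizer_step(stats, busy_threshold=100.0):
    """Stand-in for option selection in the second encoder (hypothetical rule)."""
    return 8 if stats["variance"] > busy_threshold else 2


def second_pass_encode(frame, stats):
    """Stand-in for the second encoder: quantize using the selected option."""
    step = choose_quantizer_step(stats)
    return step, [round(v / step) for v in frame]


def encode_two_pass(frames):
    """Encode each frame using statistics gathered from that same frame."""
    encoded = []
    for frame in frames:
        stats = first_pass_statistics(frame)  # first encoder sees the frame
        encoded.append(second_pass_encode(frame, stats))  # statistics handed over
    return encoded


# Two hypothetical frames of image values: one nearly flat, one busy.
result = encode_two_pass([[10, 10, 12, 10], [0, 40, -40, 0]])
print(result)  # [(2, [5, 5, 6, 5]), (8, [0, 5, -5, 0])]
```

The point of the sketch is only that each frame's encoding options are chosen from statistics of that same frame, rather than from those of a preceding frame as in a one-pass arrangement.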