The present invention relates to image and video processing and, more particularly, but not exclusively to image and video processing for upsampling, compression, decompression, motion detection and like applications, the processing being based on assumptions about localized behavior of the image data in the probabilistic sense.
Video and image compression reduce the amount of data that needs to be stored or transferred and therefore are instrumental in making feasible numerous digital media applications such as digital TV, streaming over the Internet, digital cameras for personal use and for surveillance, and video for cellular phones. The theoretical topics involved in video and image compression include information theory, signal processing and psycho-visual aspects of human vision.
Video Compression
The need for video compression arises from the fact that uncompressed digital images (or video streams) demand huge amounts of storage and/or network bandwidth. On the other hand, most compression is lossy, and thus a compressed image is a degraded image. Compression is primarily measured by quality per bit rate (hereunder QBR). Other important characteristics are flexibility and scalability, particularly for applications involving streaming. For many applications, the ability to edit the data without accompanying degradation of the signal is helpful. Such an ability allows for an ‘encode once, deliver anywhere’ capability, which is not really available today.
Codec scalability refers to the ability to adapt to variable bit-rates, say for varying levels of available bandwidth. A connection, say over the Internet, may vary over time in the amount of available bandwidth, and a single image or video source file encoded using a scalable codec can be streamed and played back at different bit-rates for different end devices or over different connections. The ability to provide codec scalability today is limited.
Existing Solutions
Image compression generally works by exploiting redundancy within the image, and video compression generally exploits both inter-frame and intra-frame redundancy.
Some solutions, such as vector quantization and matching pursuit, try to exploit such redundancy directly. Other solutions exploit redundancy through the use of transforms such as the Discrete Cosine Transform (DCT) and the Wavelet transform (WL), which are designed to achieve sparse representation of images. The use of transforms sidesteps both the complexity of the direct solutions and their lack of generality. Existing solutions can be divided into three groups:
DCT-based applications, to which most current commercial compression products belong, for example MPEG-1, MPEG-2 and MPEG-4; H.263, H.26L and H.264; and products from Microsoft, RealNetworks, etc.;
Wavelet-based applications, for example JPEG2000; and
Others: non-DCT, non-scalable applications, for example Matching Pursuit, Vector Quantization, Object Oriented.
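The sparse representation that transforms such as the DCT aim for can be illustrated with a minimal sketch. It is not taken from any of the products named above; it assumes a hypothetical 16-sample row of smoothly varying pixel values and a straightforward orthonormal DCT-II written in plain Python. The names `dct2`, `idct2` and `energy_fraction` are illustrative only.

```python
import math


def _scale(k, n):
    # Orthonormal scaling: the DC term uses sqrt(1/n), all others sqrt(2/n).
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence (direct O(n^2) evaluation)."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n))
        for i in range(n)
    ]


def energy_fraction(c, m):
    """Fraction of total energy held by the m largest-magnitude coefficients."""
    total = sum(v * v for v in c)
    top = sum(v * v for v in sorted(c, key=abs, reverse=True)[:m])
    return top / total


# Hypothetical smooth image row: a linear brightness ramp (an assumption
# made for illustration, not data from the text).
row = [10.0 * i for i in range(16)]
coeffs = dct2(row)
restored = idct2(coeffs)

print("energy in 3 largest coefficients:", energy_fraction(coeffs, 3))
print("max round-trip error:", max(abs(a - b) for a, b in zip(row, restored)))
```

For smooth input of this kind, a handful of coefficients carries almost all of the signal energy, which is the redundancy a transform codec exploits when it quantizes or discards the remaining coefficients. Real image content with edges and texture is less compactly represented.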
Shortcomings of Existing Solutions
DCT-based algorithms are the most common today and include the industry leaders in terms of quality per bit rate (QBR). Nevertheless, they are far from being a sufficient solution for the crucial bandwidth and storage issues with which the industry is faced. In addition, DCT-based algorithms lack a natural solution for other important properties such as spatial and temporal scalability and editing capabilities. It is also believed that DCT is at the end of its improvement curve, so that dramatic improvement may only be expected from non-DCT solutions.
Wavelet-based applications claim the potential to be comparable to DCT solutions in terms of QBR and also to be scalable. However, that potential has not yet been realized in video, mainly because of the shift invariance problem, which is discussed in greater detail below.
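As a rough illustration of the shift invariance problem, the following minimal sketch applies one level of an orthonormal Haar transform to a step edge and to the same edge shifted by one sample. The Haar wavelet and the 8-sample signal are assumptions chosen for brevity, not the wavelets used in any particular codec, and the function names are hypothetical.

```python
import math


def haar_level(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def detail_energy(x):
    """Energy of the detail (high-pass) coefficients of one Haar level."""
    _, detail = haar_level(x)
    return sum(v * v for v in detail)


# A step edge aligned with the pairing grid of the decimated transform.
edge = [0.0] * 4 + [1.0] * 4
# The same edge moved right by one sample, as might happen between frames.
shifted = [0.0] + edge[:-1]

print("detail energy, original edge:", detail_energy(edge))
print("detail energy, shifted edge: ", detail_energy(shifted))
```

The two inputs differ only by a one-sample translation, yet the detail coefficients differ substantially, because the decimated transform pairs samples on a fixed grid. In video, where content moves by arbitrary amounts between frames, this makes wavelet coefficients of successive frames hard to predict from one another.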
As mentioned above, the solutions in the third group are non-general, and thus it remains to be seen whether they can deliver reasonable performance for anything beyond a narrow range of applications.
Requirements for Successful Image Data Compression
There is a need for an image compression technique that meets the following requirements:
As high as possible quality per bit-rate;
As sharp as possible an upsampling tool;
The ability to provide compressed data streaming that can support fast changes of bit rate;
The ability to retain precision following repeated editing tasks; and
Codec scalability for effective video streaming over packetized networks at a large scale. Scalability must be sufficient to overcome the variations in bit-rate access speeds, end-device resolutions, CPU power, and even variability of bit-rate within a single IP session.
A device and technique optimized for all of the above requirements may be expected to reduce the infrastructure required for storing and transporting multiple bit-rate files, to avoid expensive trans-rating per channel, and to provide the best streaming quality for the available bandwidth and end-device limitations.
There is thus a widely recognized need for, and it would be highly advantageous to have, an image processing device and technique which provides a scalable codec to optimize the above features in a way which is simply not available today.