Compared with digital words or text, visual objects provide more perceivable information but also require more data for transmission and storage: “a picture is worth a thousand words.” To facilitate the use of digital images, many image compression techniques have been developed for representing images compactly. Image compression is a key technology in the development of various multimedia applications.
Attempts have been made to develop compression techniques that rely on identifying and utilizing visual features within images to achieve high coding efficiency. Characteristics of the human visual system (HVS) are incorporated into coding methods to remove some of the visual redundancy inherent in images and to enhance the visual quality of the resulting images. Development of such coding schemes is greatly influenced by the availability and effectiveness of related techniques, such as edge detection and segmentation.
Recently, vision-related technologies have shown remarkable progress in hallucinating images with good perceptual quality. Attractive results have been achieved by newly presented vision technologies, such as feature extraction, image completion, and super-resolution. New ways to represent images are based on primitive visual elements, such as edge, color, shape, texture, and other visual features. Essentially, image compression schemes and vision systems face a similar problem: how to represent visual objects efficiently and effectively. These new representations raise the possibility of applying certain vision technologies to compression systems to achieve perceptual quality rather than pixel-wise fidelity.
Significantly reducing visual redundancy on top of current transform-based coding schemes appears promising, as exemplified by the success of applying image inpainting technologies to image coding. Moreover, compression systems and vision methods can benefit from each other when vision methods are introduced into data compression. On the one hand, because complete source images are available in compression systems, new vision technologies can fully exploit all the available source information. On the other hand, computer vision and graphics technologies may lead to new ways of exploiting visual redundancy in images in pursuit of good perceptual quality.
Although there is a large volume of knowledge on image compression, the majority of image coding techniques are based on transform methods. A conventional image compression system generally has an encoder module that consists of a prediction model, a linear transform (such as DCT or DWT), a quantizer, and an entropy encoder, together with a corresponding decoder. Such frameworks have been widely employed in many compression systems and standards, such as JPEG. They exploit the statistical redundancy inherent in images and follow classical information theory in pursuing a compact representation of images.
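The transform, quantizer, and decoder stages described above can be sketched on a single one-dimensional block of eight samples. This is a minimal illustration under stated assumptions, not the pipeline of any particular standard: the prediction model is omitted, the entropy encoder is replaced by a Shannon-entropy estimate of the bit cost, and the function names (`dct`, `idct`, `quantize`, `dequantize`, `entropy_bits`, `encode`, `decode`) and the step size are hypothetical choices made for this example.

```python
import math
from collections import Counter


def _scale(k, n):
    # Normalization that makes the DCT basis orthonormal.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(block):
    """Orthonormal DCT-II of a 1-D block (the linear transform stage)."""
    n = len(block)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): orthonormal DCT-III."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def quantize(coeffs, step):
    """Uniform scalar quantizer: the only lossy stage in this sketch."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    return [q * step for q in levels]


def entropy_bits(symbols):
    """Shannon-entropy estimate of the bits needed for the symbols.

    Stands in for a real entropy encoder (assumption of this sketch).
    """
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c * math.log2(c / total) for c in counts.values())


def encode(block, step):
    return quantize(dct(block), step)


def decode(levels, step):
    return idct(dequantize(levels, step))


if __name__ == "__main__":
    block = [100, 102, 104, 106, 108, 110, 112, 114]  # smooth ramp of pixel values
    step = 4.0
    levels = encode(block, step)
    restored = decode(levels, step)
    print("quantized levels:", levels)
    print("estimated bits:", entropy_bits(levels))
    print("max abs error:", max(abs(a - b) for a, b in zip(block, restored)))
```

On smooth input, most of the signal energy lands in a few low-frequency coefficients, so many quantized levels become zero and the estimated bit cost drops. This is the statistical redundancy that such frameworks exploit.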
Compression systems have also been developed that identify visual features or apply learning models at the encoder and/or decoder to achieve high performance. Such coding systems typically include an additional module for feature extraction or learning tools in the coding process. Conventional compression techniques assisted by features (such as edges) or tools (such as neural networks) provide better adaptation and efficiency for data compression. Conventional learning-based compression schemes require the same training method and statistical models to be used on both the encoder and decoder sides. To meet this requirement, additional information that specifies the form of the operative model and its parameters must either be obtained on one side and transmitted to the other, which sacrifices coding performance, or be generated on both sides by an identical procedure (including model, parameters, input data, etc.), which greatly limits applicability.