Any discussion of documents, devices, acts or knowledge in this specification is included to explain the context of the invention. It should not be taken as an admission that any of the material forms a part of the prior art base or the common general knowledge in the relevant art on or before the priority date of the disclosure and broad consistory statements herein.
Compressing digital picture data for storage and transmission has attracted considerable research over many years. Rapid advances in digital technology have seen an explosive growth in digital image and video content, which is stored, transmitted between locations, and shared and used in various internetwork applications, such as social networking and online picture albums. Such growth is expected to continue with many digital applications that have not yet been realised. Many digital applications today are limited by transmission bandwidth and storage capacity. Although increasing transmission bandwidth and storage capacity may be a relatively easy task, the cost and effort required would be uneconomical, especially for transmission bandwidth and given that the resolution and volume of digital image and video content are ever increasing.
Digital images, including still images and moving images, comprise a rapidly growing segment of digital information which is stored and communicated in day to day activities, research, science, government and commercial industries. There are numerous applications for the storage and transmission of digital information, including satellite imaging, medical imaging, tele-health applications, video-on-demand, digital still and video cameras, imaging applications in cloud computing, interactive multimedia applications, video games, advanced security systems and digital television (DTV) systems such as high-definition TV (HDTV) supported by DTV broadcasting.
As resolution of digital imaging equipment increases, the number of digital bits required to represent detailed digital images increases proportionally. This has placed a considerable strain on communication networks and storage devices needed to transport and store these images. As a result, considerable research efforts have been invested into compression of digital images. The ultimate goal of compression is to reduce the number of bits required to be stored and/or transmitted, thereby providing more efficient utilization of bandwidth and storage capacity.
Image compression falls into two categories: lossless (reversible) and lossy (irreversible). Lossless compression preserves all information, but is generally limited to a compression gain of between 50% (2:1) and 75% (4:1). In contrast, lossy compression removes a portion of the information in images in exchange for higher compression ratios, at the cost of degraded image quality. Lossy picture compression is therefore seen as a balancing act between rate (compression ratio) and distortion (picture degradation). That is, what is the distortion level for a given compression rate or, conversely, what is the best possible rate for a given quality (distortion) level? This relationship is expressed by the rate-distortion (R-D) curve.
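The percentage figures quoted for lossless compression follow directly from the compression ratio: a ratio of r:1 corresponds to a size reduction of 1 - 1/r. The following is a minimal sketch of that arithmetic only; the function name is illustrative and does not appear in this specification.

```python
def size_reduction_percent(ratio: float) -> float:
    """Return the percentage of bits saved for a compression ratio of ratio:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return (1.0 - 1.0 / ratio) * 100.0


# The two ratios quoted in the text for lossless compression.
for ratio in (2, 4):
    print(f"{ratio}:1 -> {size_reduction_percent(ratio):.0f}% reduction")
```

For the 2:1 and 4:1 ratios mentioned above, this gives the 50% and 75% figures respectively.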
Conventional lossy compression methods rely on statistical measures and some primitive heuristics to identify and remove information in images that is considered less important, unimportant or insignificant. However, these approaches are unable to accurately and consistently identify and preserve visually important information, leading to fluctuations in visible distortions and inconsistent levels of picture quality. Effective solutions to this problem have focussed on incorporating functionalities of the human visual system (HVS) into picture coding systems to provide some control over perceived picture quality relative to compression ratios. This class of picture coders is generally referred to as perceptual image (and video) coders. Even with HVS modelling, visible distortions are inevitable when high compression ratios become a practical necessity due to limitations in transmission bandwidth and/or storage capacity. However, in situations where there are no transmission or storage limitations, it is highly desirable to have compressed images which exhibit no visible distortions at all. This is especially true for critical applications such as medical imaging. Perceptually lossless image coding is the solution for flawless compression of images.
Neither perceptual image coding nor perceptually lossless image coding is truly possible without sufficient understanding of the psychophysical aspects of the HVS and the necessary tools to model them. In essence, an HVS model is a core component of any perceptually lossless image coder. Past attempts at perceptual image coding have focussed on experimental/heuristic approaches, for example, simple threshold operations in which pixels below a certain magnitude range are removed. More recent HVS models have focussed on low-level functional aspects of the human eye and brain and work predominantly towards the JND (Just-Noticeable-Difference) level, i.e., the point at which distortions are just visible. There are three issues to consider when it comes to HVS models for perceptual image coding and perceptually lossless image coding: accuracy, complexity, and applicability/adaptability.
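The simple threshold operation mentioned above as an early heuristic can be sketched as follows. This is an illustrative example only: the function name, the sample values and the threshold of 5 are assumptions chosen for demonstration and are not taken from any particular coder.

```python
def hard_threshold(samples, threshold):
    """Set to zero every sample whose magnitude is below the threshold."""
    return [s if abs(s) >= threshold else 0 for s in samples]


# Hypothetical sample values; small-magnitude entries are discarded.
samples = [12, -3, 0, 7, -1, 25]
print(hard_threshold(samples, 5))  # [12, 0, 0, 7, 0, 25]
```

Such an operation takes no account of image content or viewing conditions, which is why it cannot consistently preserve visually important information.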
Accuracy is dependent on the level of HVS modelling involved. Simple HVS models trade accuracy for simplicity. That is, they are less accurate, but also less complex. Complexity relates not only to computation and implementation, but also to the optimization process of the model, which is dependent on the number of model parameters. Advanced HVS models have parameters which must be calibrated in order for these models to operate properly. These parameters are optimized based on experimental data obtained through extensive and tedious visual experiments. Applicability is an integration issue of where and how an HVS model can be used. For example, how does one adapt a particular HVS model into existing image encoders? In some cases, HVS models must be adapted to both the encoder and the decoder in order to function. Having to make extensive modifications to the encoder, the decoder or both increases complexity and incurs additional cost and time for manufacturers and users of compression technology. Ideally, HVS models should be seamlessly integrated into an encoding system, as exemplified by Watson's DCTune, which adaptively selects quantization levels for the JPEG (Joint Photographic Experts Group) coder to match image content for a predetermined level of picture quality.
Existing technologies in the market for HVS-based compression have focussed on the JND level, the point at which distortions are just visible. Therefore, current image coding technologies in the market do not have the capability to provide an optimal compression rate with perceptually lossless quality. The issue of image compression at the just-not-noticeable-difference (JNND) level, i.e., the point at which distortions are just imperceptible, was addressed partially in the following articles:
    1. D. M. Tan and H. R. Wu, “Perceptual Lossless Coding of Digital Monochrome Images”, in Proceedings of IEEE International Symposium on Intelligent Signal Processing and Communication Systems, 2003;
    2. D. M. Tan and H. R. Wu, “Adaptation of Visually Lossless Colour Coding to JPEG2000 in the Composite Colour Space”, in Proceedings of IEEE Pacific Rim Conference on Multimedia, pp. 1B2.4.1-7, 2003; and
    3. D. Wu et al., “Perceptually Lossless Medical Image Coding”, IEEE Transactions on Medical Imaging, Volume 25, No. 3, March 2006, Pages 335-344.
All three articles disclosed the same perceptually lossless coding core structure applied to monochrome, colored and medical images, respectively. This core structure is based on a monochrome HVS model which was duplicated for color and extended for medical image compression. The duplication for the visually lossless color coder therefore has three identical monochrome HVS models, one for each color channel. The first and the third coders accept only greyscale images and do not operate on color images. The coders disclosed in these articles operate on a quantized-transformed (QT) image prior to bit-plane encoding to generate a set of estimations. These coders model two general aspects of the HVS: visual masking and visual acuity. These two HVS aspects are implemented based on a comparative model which measures the difference between reference signals and manipulated signals.
Here, the reference and manipulated signals are represented by the reference transformed (RT) image and the QT image, respectively. Initially, visual weighting is applied to both the RT and the QT images to gauge the visual acuity of the HVS. This is followed by visual masking computation for both the QT and the RT images. A set of visual distortions is then generated from the difference between the masking outputs of the RT and the QT images. Thereafter, a perceptual error is determined by comparing the distortion levels with subjectively determined thresholds. If a distortion level is below the threshold point, the sample point in the transformed image under evaluation is then filtered to the corresponding (perceptually lossless) threshold point. This system operates in the DWT domain only.
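The comparative model described above (weighting, masking, differencing, then comparison against a threshold) can be outlined roughly as follows. This is a sketch under stated assumptions and not the method of the cited articles: the masking function is a hypothetical placeholder, the weight and threshold values are arbitrary, the data are one-dimensional lists rather than DWT subbands, and the final filtering step is omitted because its details are not given here.

```python
def weight(coeffs, w):
    """Visual weighting (acuity): scale each coefficient by a band weight w."""
    return [w * c for c in coeffs]


def mask(coeffs, k=1.0):
    """Placeholder masking response (hypothetical form): each sample is
    divided by a term that grows with the mean magnitude of its neighbours."""
    out = []
    n = len(coeffs)
    for i, c in enumerate(coeffs):
        lo, hi = max(0, i - 1), min(n, i + 2)
        neighbours = [abs(coeffs[j]) for j in range(lo, hi) if j != i]
        mean = sum(neighbours) / len(neighbours) if neighbours else 0.0
        out.append(c / (k + mean))
    return out


def below_threshold(rt, qt, w, threshold):
    """Flag samples whose distortion between the reference (RT) and
    quantized (QT) coefficients falls below the given threshold."""
    r = mask(weight(rt, w))
    q = mask(weight(qt, w))
    distortion = [abs(a - b) for a, b in zip(r, q)]
    return [d < threshold for d in distortion]


# Hypothetical coefficients: only the last sample differs after quantization.
rt = [10.0, 4.0, -6.0, 1.0]
qt = [10.0, 4.0, -6.0, 0.0]
print(below_threshold(rt, qt, w=1.0, threshold=0.05))
```

In the coders described, the flagged sample points would be the ones subsequently filtered, and the thresholds would be determined subjectively rather than fixed as in this sketch.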
In order to provide improved performance in compression ratio and/or quality, there is a need for a more accurate HVS model with low complexity that can be applied to all types of still images and moving images (videos).
The present invention seeks to limit, reduce, overcome, or ameliorate one or more of the above disadvantages by providing an improved perceptually lossless image coding technique. This technique may have a different HVS model that has lower complexity and better modelling accuracy. In addition, a perceptually enhanced image coding component or technique is introduced. The picture enhancement operations may be embedded within an image encoder using this component. This ensures that the quality of compressed pictures is visually better than or equal to that of the reference (original/raw) pictures.