Image processing systems usually include, among other components, an encoder that encodes image data and a decoder that reconstructs an image from the encoded image data. In many systems, the encoder applies wavelet transforms to raw image data to derive sets of data, referred to as wavelet coefficients, at different spatial resolutions. In the wavelet decomposition process, coefficients are computed for pixel values within the image as the image is progressively broken down into images of lower frequency and resolution. In decoding or reconstruction, a resolution that provides more or less detail may then be selected by choosing the appropriate level of wavelet decomposition. These techniques may also spatially subdivide the original image into a number of subregions to obtain a multi-resolution representation of the image, in which a desired resolution may be selected for a reduced region of interest rather than for the entire image.
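By way of illustration only, the progressive decomposition described above may be sketched with a simple averaging form of the Haar wavelet. The function names `haar_level` and `decompose` are hypothetical, the image is assumed to be a list of rows with even dimensions at every level, and a practical encoder would typically use other filters and add quantization and entropy coding.

```python
def haar_level(img):
    """Apply one level of a 2-D Haar-style transform (averaging variant).

    Returns the half-resolution approximation plus three detail sub-bands.
    Assumes img is a list of equal-length rows with even width and height.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            # The four pixels of one 2x2 block.
            tl, tr = img[r][c], img[r][c + 1]
            bl, br = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 4)  # local average
            h_row.append((tl - tr + bl - br) / 4)  # left minus right
            v_row.append((tl + tr - bl - br) / 4)  # top minus bottom
            d_row.append((tl - tr - bl + br) / 4)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def decompose(img, levels):
    """Repeat the transform on the approximation, as in a wavelet pyramid.

    Returns the coarsest approximation and a list of per-level detail
    sub-bands, ordered from the finest level to the coarsest.
    """
    details = []
    approx = img
    for _ in range(levels):
        approx, horiz, vert, diag = haar_level(approx)
        details.append((horiz, vert, diag))
    return approx, details


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse, details = decompose(image, 2)
```

In this sketch, a decoder wanting less detail would stop at a coarser approximation, while a decoder wanting more detail would combine the approximation with the detail sub-bands of successively finer levels.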
Theoretically, the encoder may apply any number of levels of wavelet transform to each subregion. Ultimately, this process could be continued until a single coefficient remains at the lowest frequency level, after which further decomposition is not possible. There is typically no need to decompose to such low frequency levels, and the process may be usefully stopped at a desired level. Similarly, the subdivision of the image into subregions could continue until each subregion is a single pixel, although the useful degree of subdivision, or number of spatially relevant regions “N”, generally lies between the undivided original image and this limit.
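The two limits mentioned above may be made concrete with a small sketch. The helper names `max_levels` and `region_count` are hypothetical; the sketch assumes each decomposition level halves both dimensions (rounding up) and that the image is subdivided into square subregions of a fixed side length.

```python
def max_levels(width, height):
    """Count the halvings needed to reduce the image to a single sample."""
    levels = 0
    while width > 1 or height > 1:
        width = (width + 1) // 2    # halve, rounding up for odd sizes
        height = (height + 1) // 2
        levels += 1
    return levels


def region_count(width, height, side):
    """Number of square subregions "N" of the given side covering the image."""
    across = -(-width // side)      # ceiling division
    down = -(-height // side)
    return across * down


levels_512 = max_levels(512, 512)           # theoretical limit for 512 x 512
n_whole = region_count(512, 512, 512)       # one region: the undivided image
n_tiles = region_count(512, 512, 64)        # an intermediate subdivision
n_pixels = region_count(512, 512, 1)        # one region per pixel
```

Under these assumptions, a 512 x 512 image admits at most 9 levels, and N ranges from 1 for the undivided image to 262,144 when every pixel is its own subregion, with 64 regions at a side length of 64.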
In many applications, such as surveillance systems and video conferencing systems, it is often desirable to work with certain portions of the image instead of the entire image. Such portions of the image are generally referred to as regions of interest (ROIs). ROIs typically include the more important information pertaining to the image, at least for the purposes of the viewer. The use of ROIs or subregions, each wavelet encoded, enables more detailed information to be added for the subregion by reconstructing that subregion using higher frequency or resolution data (i.e., from a higher level of the decomposition).
The ROI is usually defined during the decoding process: a portion of the image pertaining to the ROI is selected, and the additional data needed to represent the ROI in greater detail is used to reconstruct just that region. The quality of a background region may be decreased, at least as compared to the ROI. To support ROI scalability, all of the N subregions must be wavelet encoded, typically with individual header information, which results in additional processing by the encoder and storage of additional information. On the other hand, if only a few subregions are encoded, that is, if the value of “N” is lowered to decrease the processing overhead, the encoder may not be able to support ROI scalability, or only a very limited selection of individual regions will be available.
Several techniques are currently available to obtain ROI scalability with wavelet encoding. Such techniques include a maximum shift method and generic scaling. Both techniques place bits associated with the ROI in higher bit planes and shift the bits associated with regions other than the ROI, or background regions, to less significant bit planes. One disadvantage of this approach is that the relative importance of the ROI and the background region cannot be flexibly controlled by adjusting the scaling values. In other words, no information about the background regions can be received by a decoder until all the information about the ROI is decoded. In addition, such techniques may provide for ROI selection at the encoder and provide little or no flexibility to a user to interactively select the region of interest from the original image.
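As a rough illustration of the bit-plane shifting described above, the following sketch follows the general idea of the maximum shift method. It is a simplification under stated assumptions: coefficients are treated as non-negative integer magnitudes (sign handling and quantization are omitted), the ROI is given as a per-coefficient boolean mask, and the names `maxshift_encode` and `maxshift_decode` are hypothetical. Shifting the ROI up is shown here; it is equivalent in relative terms to shifting the background down.

```python
def maxshift_encode(magnitudes, roi_mask):
    """Raise ROI coefficients above every background bit plane.

    The shift s is chosen so that 2**s exceeds the largest background
    magnitude; each ROI magnitude is then shifted left by s bits.
    """
    background = [m for m, in_roi in zip(magnitudes, roi_mask) if not in_roi]
    s = max(background, default=0).bit_length()
    shifted = [m << s if in_roi else m
               for m, in_roi in zip(magnitudes, roi_mask)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Recover magnitudes without a mask: values >= 2**s belong to the ROI."""
    threshold = 1 << s
    return [m >> s if m >= threshold else m for m in shifted]


magnitudes = [5, 3, 12, 7]
roi_mask = [False, True, True, False]
shifted, s = maxshift_encode(magnitudes, roi_mask)
restored = maxshift_decode(shifted, s)
```

Because every nonzero ROI value ends up at or above 2**s while every background value stays below it, a decoder reading bit planes from most to least significant receives all ROI bits before any background bits. The shift is fixed by the background magnitudes rather than chosen freely, which is the inflexibility noted above.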