Piecewise smooth spatial data sets, such as images that represent depth or elevation data, arise in a variety of applications where compression is required. Typical image compression schemes rely on transforms that exploit the spatial correlation among samples in a neighborhood. While these transforms are able to exploit spatial redundancy in smooth regions, they perform poorly in the vicinity of discontinuities. In certain applications the treatment of discontinuities takes on special significance; for example, in the case of depth maps, the accurate treatment of discontinuities becomes more important when depth information is used to infer 3D geometric structure. In this instance, even small depth errors in the vicinity of object edges can produce large errors in imagery that is synthesized using the inferred structure.
For many applications, in addition to compression performance, features such as resolution scalability and embedded coding are highly desirable. JPEG 2000 offers these scalability features and has been found to be beneficial for the interactive communication of terrain elevation data, as well as depth maps for image-based rendering. However, as noted earlier, problems are encountered in the vicinity of discontinuities in the depth map.
Considering the significance of object boundaries to depth data, previous work has focused on incorporating object geometry into the compression scheme in some way. The motivation is that geometry information can be used to adapt the local basis functions that are employed for transform-based image coding. Significant performance improvements can be gained by ensuring that the basis functions do not cross sharp object boundaries.
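The benefit of keeping basis functions from crossing a boundary can be illustrated with a minimal one-dimensional sketch. The function name `predict_details`, the `breaks` argument, and the choice of a simple lifting-style prediction step are assumptions made for illustration only; they are not taken from any of the schemes discussed here. Each odd sample is predicted from its even neighbors, and the adaptive variant simply ignores a neighbor that lies on the other side of a known discontinuity.

```python
def predict_details(x, breaks=frozenset()):
    """Lifting-style prediction residuals for the odd-indexed samples of x.

    `breaks` holds every index i for which a discontinuity lies between
    x[i] and x[i + 1] (a hypothetical, illustrative geometry description).
    Neighbors separated from the predicted sample by a break are ignored.
    """
    details = []
    for k in range(1, len(x), 2):
        neighbours = []
        if (k - 1) not in breaks:                 # left neighbor on same side
            neighbours.append(x[k - 1])
        if k + 1 < len(x) and k not in breaks:    # right neighbor on same side
            neighbours.append(x[k + 1])
        prediction = sum(neighbours) / len(neighbours) if neighbours else 0.0
        details.append(x[k] - prediction)
    return details


# A piecewise constant "depth" row with a step between samples 3 and 4.
row = [10, 10, 10, 10, 50, 50, 50, 50]

non_adaptive = predict_details(row)           # prediction crosses the edge
adaptive = predict_details(row, breaks={3})   # prediction stops at the edge
```

In this toy case the non-adaptive prediction produces a residual of magnitude 20 at the edge, while the adaptive prediction produces only zeros, so all of the signal energy stays in the low-pass samples.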
Prior work has explored the option of first segmenting objects from the depth map and then compressing the smooth regions within each segment, while separately describing the object boundaries using various methods. Unfortunately, segmentation is not a well-defined operation, and the proposed boundary description schemes do not provide a scalable and embedded representation of object boundaries. In general, boundaries of segmented objects are conveyed by first assigning a label to each sample location in accordance with the segmented region to which it belongs; these labels are then coded using schemes that exploit local context (e.g. arithmetic coding). Such an approach does not facilitate scalable decoding and, more importantly, does not allow for embedded coding, since it makes no sense to apply embedded quantization and coding schemes to labels. Another important limitation of segmentation-based approaches is that segmentation is often performed as a pre-processing step, prior to coding, and it is therefore difficult to subject segmentation decisions to rate-distortion considerations.
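The point about labels and embedded quantization can be made concrete with a small sketch. The helper `drop_bitplanes` and the sample values are hypothetical; discarding least significant bit-planes is used here only as a simple stand-in for decoding a truncated embedded bit-stream.

```python
def drop_bitplanes(values, n):
    """Zero the n least significant bits of each non-negative integer."""
    return [(v >> n) << n for v in values]


depth = [200, 201, 203, 90, 91]   # depth samples on either side of an edge
labels = [4, 4, 4, 5, 5]          # segment label of each sample

coarse_depth = drop_bitplanes(depth, 2)    # still a usable approximation
coarse_labels = drop_bitplanes(labels, 1)  # the two segments collapse into one
```

The coarsened depth values remain close to the originals and the edge is preserved, whereas coarsening the labels by a single bit-plane merges the two regions and erases the boundary entirely, because label values carry no notion of numerical proximity.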
Alternative schemes for depth map coding that avoid the need for object segmentation have also been pursued. In one example, discontinuities in the depth map are described using a quad-tree representation in which leaf nodes of the tree are allowed to model discontinuity boundaries. This allows for a piecewise description of discontinuities, which can be constructed subject to rate-distortion considerations. In more recent work, the initial segmentation step is replaced with an edge detection phase in which connected edges are prioritized according to their impact on rate-distortion performance and then coded using a chain coding algorithm. While the above schemes have advantages over purely segmentation-based approaches, the issues of resolution-scalable decoding and embedded geometry representation are not explored.
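For readers unfamiliar with chain coding, the following is a minimal sketch of a generic 8-direction (Freeman-style) chain code. It is not the specific algorithm used in the work referred to above; the names `DIRECTIONS`, `chain_encode`, and `chain_decode`, and the direction numbering, are assumptions chosen for illustration. A connected edge is represented by its starting pixel followed by one direction symbol per step.

```python
# (row step, column step) for symbols 0..7, starting east and turning
# counter-clockwise; rows are assumed to increase downwards.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def chain_encode(points):
    """Encode an ordered, 8-connected pixel path as (start, symbols)."""
    symbols = []
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        # Raises ValueError if consecutive pixels are not 8-connected.
        symbols.append(DIRECTIONS.index((r1 - r0, c1 - c0)))
    return points[0], symbols


def chain_decode(start, symbols):
    """Rebuild the pixel path from its starting pixel and symbols."""
    path = [start]
    for s in symbols:
        dr, dc = DIRECTIONS[s]
        r, c = path[-1]
        path.append((r + dr, c + dc))
    return path


edge = [(0, 0), (0, 1), (1, 2), (2, 2)]
start, symbols = chain_encode(edge)
```

Each step costs at most three bits before entropy coding, but the representation is sequential: every pixel position depends on all preceding symbols, which is one reason such a description does not lend itself naturally to resolution-scalable or embedded decoding.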
At a broader level, prior attempts at incorporating image geometry into spatial transforms include work on the directional DWT and bandlets. In both cases, a block-based description of dominant orientation is required. For the directional DWT, non-overlapping blocks of variable size describe the dominant orientation in the image domain, while for bandlets, block-based descriptions are used to convey the dominant orientation of 2D DWT coefficients. A block-based description of geometry is not optimal at object boundaries, especially when the boundary contour is irregular or far from a simple linear representation. The directional DWT and bandlets are designed to be responsive only to the dominant or average orientation of an object discontinuity within a given region.