Since 2017, point clouds have been considered as a candidate format for the transmission of 3D data, whether captured by 3D scanners or LIDAR sensors, or used in popular applications such as VR/AR. A point cloud is a set of points representing the target object in 3D space. Besides the spatial position (X, Y, Z), each point usually has associated attributes, such as color (R, G, B) or even reflectance and temporal timestamps (e.g., in LIDAR images). In order to obtain a high-fidelity representation of the target 3D objects, devices capture point clouds on the order of thousands or even millions of points. Moreover, for dynamic 3D scenes used in VR/AR video applications, every single frame often has a unique, very dense point cloud, so that the transmission of several million points per second is required. For viable transmission of such a large amount of data, compression is often applied.
Two different technologies for point cloud compression are currently under consideration by MPEG. One is 3D native coding technology (based on octree and similar coding methods); the other is 3D to 2D projection, followed by traditional video coding. In the case of dynamic 3D scenes, MPEG is using a test model software (TMC2) based on patch surface modeling, projection of patches from 3D to form 2D images, and coding the 2D images with video encoders such as HEVC. This method has proven to be more efficient than native 3D coding, and it is able to achieve competitive bitrates at acceptable quality. FIG. 1 illustrates some steps of this method as practiced prior to the present invention, showing a representation 102 of the 3D object of interest, the classification or breakdown of this representation into 3D surface patches, 104, the projection of these patches onto a 2D image plane to form 2D patches such as patch 106, 2D image packing 108, and a background filling scheme, including patch dilation, as shown in 110. These steps will now be discussed in more detail.
When processing point clouds, TMC2 classifies the points according to the direction of their normals and groups connected components with similar classifications. This results in three-dimensional surface patches that are then projected onto 2D axis-aligned planes, whose orientations depend on the classification of the points in each patch. The projections of the patch surfaces serve two purposes: to represent the position of the points in 3D space, by recording the distance of each point to its projection plane, and to represent their respective color values. Each 2D projected patch is placed into a 2D image, resulting in one sequence of depth images and another sequence of texture (color) values (RGB). Notice that the projected data do not cover all the pixels in the 2D image; for those positions that are not covered, a dilation operation fills in the missing values. In the case of the depth sequence, the depth images are typically packed into the luminance (Y) channel of a video stream and compressed as a regular YUV420 sequence with an HEVC video codec. In some cases, instead of HEVC, another video encoder such as H.264 may be used. Similarly, in some cases, the depth information could be broken down into three planes, coded with 24 bits, and packed into the Y, U and V channels.
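By way of illustration, the normal-based classification step can be sketched as follows. This is a hypothetical sketch, not TMC2 source code; the function `classify_points` and the hard-coded list of six axis-aligned plane normals are illustrative assumptions, reflecting only the general principle that each point is assigned to the axis-aligned projection plane best matching its normal.

```python
import numpy as np

# Illustrative sketch (not TMC2 code): assign each point to one of six
# axis-aligned projection planes (+X, -X, +Y, -Y, +Z, -Z) according to
# which plane normal is closest in direction to the point's normal.
PLANE_NORMALS = np.array([
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1],
], dtype=float)

def classify_points(normals):
    """For each point normal, return the index of the projection plane
    whose normal gives the largest dot product with it."""
    scores = normals @ PLANE_NORMALS.T   # (N, 6) dot products
    return np.argmax(scores, axis=1)     # best-matching plane per point

normals = np.array([[0.9, 0.1, 0.0],    # mostly +X facing
                    [0.0, -1.0, 0.0],   # exactly -Y facing
                    [0.1, 0.2, 0.97]])  # mostly +Z facing
print(classify_points(normals))
```

Connected points sharing the same plane index would then be grouped into a patch and projected onto that plane.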
For the texture (RGB) data, TMC2 first converts the RGB data into YUV420, then codes the converted video sequence. For the texture, chroma sub-sampling commonly generates artifacts such as color leaking at sharp color edges. Unnatural sharp color edges may occur due to the projection of patch surfaces onto 2D axis-aligned planes. Furthermore, the placement of patches in the 2D image can also generate artificial color edges between patches. Color leaking artifacts at such color edges can appear in the 3D point cloud as visible color artifacts, reducing the quality of the reconstructed point cloud. Hereafter, in this disclosure, the term “chroma”, a standard term well understood by those of skill in the art of video compression, is used interchangeably with “color”.
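The color leaking mechanism can be illustrated with a small numerical sketch. The code below is illustrative only; it uses standard BT.601 full-range conversion coefficients, which are not necessarily the exact matrices used by TMC2, and simply models 4:2:0 sub-sampling as averaging each chroma plane over 2x2 blocks.

```python
import numpy as np

# Illustrative sketch (not TMC2 code) of why 4:2:0 chroma sub-sampling
# leaks color across a sharp edge: a 2x4 RGB image has a pure red column
# next to pure green columns; converting to YCbCr and keeping one chroma
# sample per 2x2 luma block (as in 4:2:0) blends both chromas at the edge.

def rgb_to_ycbcr(rgb):
    # BT.601 full-range forward conversion.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def ycbcr_to_rgb(ycc):
    # BT.601 full-range inverse conversion, clipped to the valid range.
    y, cb, cr = ycc[..., 0], ycc[..., 1] - 128.0, ycc[..., 2] - 128.0
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 255.0)

img = np.zeros((2, 4, 3))
img[:, 0, 0] = 255.0   # column 0: pure red
img[:, 1:, 1] = 255.0  # columns 1-3: pure green

ycc = rgb_to_ycbcr(img)
for c in (1, 2):                       # sub-sample both chroma planes
    for j in range(0, 4, 2):
        ycc[:, j:j + 2, c] = ycc[:, j:j + 2, c].mean()

rec = ycbcr_to_rgb(ycc)
# The red pixel at the edge picks up green chroma (and vice versa),
# while pixels inside a uniform 2x2 block reconstruct almost exactly.
print(np.round(rec[0, 0]), np.round(rec[0, 3]))
```

Running the sketch, the edge-adjacent red pixel reconstructs with substantial green leaked into it, while a green pixel inside a uniform 2x2 block is recovered almost exactly, mirroring the inter- and intra-patch edge artifacts described above.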
Color artifacts in the projected images may be visible between and/or within the 2D patches. FIG. 2 illustrates the issue of inter-patch color leakage. Image 202 is uncompressed, image 204 has been coded using a high-resolution (“444”) color sub-sampling scheme, and image 206 has been coded using a more standard, lower-resolution (“420”) color sub-sampling scheme. Image block 208 shows how part of the starting 3D image may be broken up into 3D surface patches, and image block 210 shows the projection of such patches onto a 2D plane before either of the “444” or “420” coding schemes is applied. The “420” coding scheme may be considered preferable to the “444” scheme in the context of efficient data transmission, as the more highly compressed data can be transmitted faster, and/or on a less expensive transmission channel. However, a zoomed-in view 214 of part of the “420” coded image shows significant color leakage, while a corresponding zoomed-in view 212 of the “444” coded image shows negligible color leakage. In other words, the demands of efficient data transmission create inter-patch color artifact problems.
FIG. 3 similarly illustrates the issue of intra-patch color leakage. A single patch, coded using a “420” color sub-sampling scheme, is shown at image block 302; the same patch coded using a “444” scheme is shown at image block 304. Zoomed-in views of the projected 2D versions of these patches are shown at blocks 306 and 308 respectively. The same parameters (lossless texture, lossy geometry) were used in setting up HDR tools in both cases. The scheme resulting in more highly compressed data (“420”) is seen to produce significant color leakage at edges within the patch (see the regions bounded by ovals in 306), while the corresponding edges produced by the lower-compression (“444”) scheme (seen in 308) are free of such leakage problems.
There is therefore a need to mitigate the color leaking artifacts, both inter-patch and intra-patch, that are typically introduced by chroma sub-sampling during the process of 3D point cloud coding.