1. Field of the Invention
The invention relates to an image processing apparatus, method, and system and a storage medium, in which a copyright can be protected.
2. Related Background Art
Hitherto, VRML (Virtual Reality Modeling Language) has been widely used as a language for describing a 3D (three-dimensional) scene. In a system using such a language, arbitrary objects are arranged in a 3D space, and a viewpoint, a light source, a texture map, and the like are set to construct a scene. A highly realistic virtual space can be formed by adding data such as video/audio data to each object.
In ISO/IEC 14496-1 (MPEG-4 Systems), on the basis of the foregoing VRML, the amount of data needed to describe the scene is reduced, and a 3D scene similar to that mentioned above is described by using BIFS (Binary Format for Scene Description), which is obtained by converting the VRML into a binary expression. The binarized BIFS data is called a BIFS stream.
Although the binarizing method is not described in detail here, a BIFS stream, unlike a text format such as VRML, must first be decoded on the display side before the scene structure can be reconstructed.
When a texture, video/audio data, or the like is used, those bit streams are also multiplexed together and transmitted and received as a single bit stream.
FIG. 1 shows an example of a conventional receiving and displaying system of 3D data.
In the diagram, reference numeral 101 denotes a bit stream receiving unit for receiving a bit stream from a line.
Reference numeral 102 denotes a demultiplexer for extracting each bit stream from the single multiplexed bit stream.
Reference numeral 103 denotes a BIFS decoder (BIFS parser) for decoding scene information to be displayed and forming a scene tree of a 3D object. “Scene tree” denotes information showing layout information of the objects, a mutual dependency relationship, and the like. Reference numeral 104 denotes an image decoder and shows a portion for decoding compressed image code data such as a JPEG file or the like.
Reference numeral 105 denotes a video decoder for decoding code data of video, and 106 indicates an audio decoder for decoding code data of audio.
Reference numeral 107 denotes a scene tree memory for storing the scene tree formed by the BIFS decoder 103.
Reference numeral 108 denotes a renderer which finally arranges a 3D object and a texture and video/audio data which are associated with the 3D object into a 3D space and displays and reproduces them on the basis of the scene tree stored in the scene tree memory 107.
Reference numeral 109 denotes a final output device. For example, image information is displayed on a TV monitor and audio information is reproduced from a speaker.
The bit stream is thus separated, decoded, rendered, and displayed in 3D as described above.
FIG. 2 shows an example of such a kind of bit stream.
Reference numeral 201 denotes a header/info stream in which a header portion and multiplexing information of each stream are written. Reference numeral 202 denotes a BIFS stream in which scene information is described; 203 an image data stream in which texture data or the like is transmitted; and 204 to 209 video/audio streams in which a video stream and an audio stream are alternately multiplexed. For media such as video and audio, which need real-time reproduction and synchronization, the video stream and the audio stream are often alternately multiplexed.
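As an illustration only, the separation of such a multiplexed bit stream into its constituent streams can be sketched as follows. The packet representation (a label plus a payload) and the labels themselves are hypothetical and merely mirror the order of FIG. 2; the actual multiplexing format of MPEG-4 Systems is not reproduced here.

```python
from collections import defaultdict

# Hypothetical packets in the order suggested by FIG. 2 (201 to 209).
# The labels and payloads are illustrative assumptions, not a real format.
PACKETS = [
    ("header", b"H"),
    ("bifs", b"S"),
    ("image", b"I"),
    ("video", b"V1"),
    ("audio", b"A1"),
    ("video", b"V2"),
    ("audio", b"A2"),
    ("video", b"V3"),
    ("audio", b"A3"),
]


def demultiplex(packets):
    """Group payloads by stream label, keeping arrival order within each stream."""
    streams = defaultdict(bytes)
    for label, payload in packets:
        streams[label] += payload
    return dict(streams)


streams = demultiplex(PACKETS)
```

In this sketch the alternately multiplexed video and audio packets end up in two separate buffers, which would then be handed to the respective decoders.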
FIG. 3 shows an example of the scene tree formed by the BIFS decoder 103. However, various field data is omitted here.
It will be understood from the scene tree shown in FIG. 3 that an image texture is adhered to a 3D object box, a movie texture is adhered to a 3D object cylinder, and, further, audio data is reproduced.
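For illustration, a scene tree of this kind can be modeled as a small tree of nodes, each optionally bound to a media stream. The class name `Node`, its fields, and the node kinds below are hypothetical simplifications; they do not reproduce the actual BIFS node set or the fields omitted from FIG. 3.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str                    # e.g. "Box", "Cylinder", "Sound" (illustrative names)
    media: Optional[str] = None  # label of the associated media stream, if any
    children: List["Node"] = field(default_factory=list)


# A scene loosely following FIG. 3: a box with an image texture,
# a cylinder with a movie texture, and an audio source.
scene = Node("Group", children=[
    Node("Box", media="image"),
    Node("Cylinder", media="video"),
    Node("Sound", media="audio"),
])


def media_bindings(node):
    """Walk the tree depth-first and list (node kind, media label) pairs."""
    pairs = [(node.kind, node.media)] if node.media else []
    for child in node.children:
        pairs.extend(media_bindings(child))
    return pairs
```

A renderer would traverse such a tree to decide which decoded stream to attach to which object.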
FIG. 4 shows a display example in the case where an image, video data, and audio data are rendered on the basis of the scene tree shown in FIG. 3.
It will be understood from FIG. 4 that a 3D object box 401 to which an image texture has been adhered and a 3D object cylinder 402 to which a movie texture has been adhered are displayed and, at the same time, an audio (audio sound or audio data) 403 is reproduced.
It will be readily understood that, as mentioned above, the foregoing VRML allows not only a still image texture but also an audio clip and a video clip to be mapped.
In recent years, there has been a tendency to adopt techniques for protecting copyright in connection with the display of such a 3D scene.
Specifically, a method has been considered whereby a stream of copyright information is inserted into a bit stream, thereby protecting data such as a texture image, video/audio data, or the like on a per-stream (media stream) basis.
According to such a method, the copyright information is multiplexed into the bit stream in advance, and a stream such as video/audio data is protected by the copyright information. Only when the stream is authenticated, by descrambling it or by collating a password or the like, is the copyright protection cancelled and the display and reproduction of the video/audio data started. Not only the video/audio streams but also a BIFS stream can be similarly protected as one media stream.
With such a method, however, the 3D object is not defined as a stream of its own, so the problem arises that the 3D object itself cannot be protected.
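A minimal sketch of such per-stream protection, assuming a simple password collation, is given below. The function `release_streams`, the `protection` table, and the `credentials` table are hypothetical names introduced only for this illustration; descrambling and the actual format of the copyright information are not modeled.

```python
import hmac


def release_streams(streams, protection, credentials):
    """Return only the streams whose protection, if any, is satisfied.

    streams:     stream label -> decoded payload
    protection:  stream label -> required password (absent means unprotected)
    credentials: stream label -> password supplied by the user
    """
    released = {}
    for label, data in streams.items():
        required = protection.get(label)
        supplied = credentials.get(label, "")
        # Unprotected streams pass through; protected ones need a matching password.
        if required is None or hmac.compare_digest(required.encode(), supplied.encode()):
            released[label] = data
    return released


streams = {"bifs": b"S", "video": b"V", "audio": b"A"}
protection = {"video": "pw", "audio": "pw"}

without_password = release_streams(streams, protection, {})
with_video_password = release_streams(streams, protection, {"video": "pw"})
```

Note that the unit of protection in this sketch is a whole stream, which is the granularity the method described above provides.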
It is now assumed as an example that a movie texture on the 3D object cylinder 402 and the audio 403 shown in FIG. 4 are protected.
In this case, after the rendering, because the movie texture on the 3D object cylinder 402 and the audio 403 are protected, they are naturally neither displayed nor reproduced, as shown at reference numerals 405 and 404. However, as shown in FIG. 5, the shape of the 3D object cylinder 402 is still displayed as it is, in the gray color that has been set as the default color.
If the user does not want the 3D object cylinder to be displayed, the BIFS stream itself has to be protected, because the 3D object is defined by the BIFS stream.
In such a case, however, the 3D object box is also no longer displayed, in the same manner as the 3D object cylinder.
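The two situations described above can be illustrated with a toy renderer. The names `SCENE`, `DEFAULT_COLOR`, and `render` are hypothetical, and the sketch assumes that a shape whose texture stream is unavailable falls back to a default color, as in the example of FIG. 5.

```python
DEFAULT_COLOR = "gray"  # assumed default surface color, as in the FIG. 5 example

# (shape, label of the texture stream), loosely following FIG. 4
SCENE = [("Box", "image"), ("Cylinder", "video")]


def render(available_streams):
    """Return the (shape, surface) pairs drawn for a given set of usable streams."""
    if "bifs" not in available_streams:
        return []  # without the scene description, no shape can be drawn at all
    drawn = []
    for shape, texture in SCENE:
        surface = texture if texture in available_streams else DEFAULT_COLOR
        drawn.append((shape, surface))
    return drawn


# Protecting only the video stream: the cylinder is still drawn, in the default color.
video_protected = render({"bifs", "image"})

# Protecting the BIFS stream: the box disappears together with the cylinder.
bifs_protected = render({"image", "video"})
```

Neither setting hides the cylinder alone, which is the limitation of stream-unit protection discussed above.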
Therefore, one approach that has been considered is to divide the BIFS stream in advance into a separate stream for each 3D object and to protect only the stream which defines the 3D object cylinder. However, it is not easy to divide the BIFS stream, and each time a 3D object is moved, modified, deleted, or newly appears, the corresponding BIFS stream has to be updated, so that the problem arises that the processes become complicated.
In the case of using VRML, another approach that has been considered is to form a VRML file corresponding to each 3D object and to describe the whole 3D scene so that each of the plurality of 3D objects can be individually recognized. In this case, however, the problem arises that forming the VRML files becomes complicated.