The present invention relates to an image processing method and apparatus for encoding and/or decoding an image to protect the copyright or the like of its rightful holder.
The present invention also relates to an image encoding apparatus and method for receiving and encoding a moving image, and an image decoding apparatus and method for decoding the encoded codes.
Furthermore, the present invention relates to a data processing method and apparatus for processing not only image data but also audio data and, more particularly, to a data processing method and apparatus which are suitable for a case wherein authentication is required for the purpose of copyright protection upon reconstructing predetermined information from a plurality of object streams.
Conventionally, known image coding schemes include intra-frame coding schemes such as Motion JPEG and Digital Video, and schemes using inter-frame predictive coding such as H.261, H.263, MPEG-1, and MPEG-2. These coding schemes have been internationally standardized by ISO (International Organization for Standardization) and ITU (International Telecommunication Union). Intra-frame coding is best suited to apparatuses which require editing and special playback of moving images, since it encodes in units of frames and allows easy management of frames. On the other hand, inter-frame coding can assure high coding efficiency since it uses inter-frame prediction based on the difference between image data of neighboring frames.
Furthermore, international standardization of MPEG-4 as a versatile next-generation multimedia coding standard which can be used in many fields such as computers, broadcasting, communications, and the like is in progress.
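The idea of inter-frame prediction based on the difference between neighboring frames can be illustrated with a minimal sketch. It is not the algorithm of any of the standards named above, which add motion compensation, transforms, and entropy coding; the function names `encode_inter` and `decode_inter` are hypothetical, and each frame is assumed to be a flat list of integer samples.

```python
def encode_inter(frames):
    """Keep the first frame as-is and store every later frame as the
    sample-wise difference (residual) from the frame before it."""
    if not frames:
        return []
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for c, p in zip(cur, prev)])
    return encoded


def decode_inter(encoded):
    """Rebuild each frame by adding its residual to the previously
    decoded frame."""
    frames = []
    for i, data in enumerate(encoded):
        if i == 0:
            frames.append(list(data))
        else:
            frames.append([p + r for p, r in zip(frames[-1], data)])
    return frames
```

When neighboring frames are similar, most residual samples are zero or small, which is what makes them cheaper to code than the frames themselves.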
As such digital coding standards have become widespread, the contents industry has strongly recognized the problem of copyright protection. That is, contents cannot be provided with confidence using standards which cannot sufficiently guarantee copyright protection.
To solve this problem, MPEG-4 adopts an IPMP (Intellectual Property Management and Protection) technique, and a function of suspending/restarting playback of an image to protect copyrights is being examined. In this scheme, copyright protection is implemented by inhibiting playback of frames, the copyright of which must be protected.
On the other hand, schemes and services which provide scrambled images whose outlines the viewers can still recognize have been started. More specifically, sub scrambling is implemented by replacing arbitrary scan lines or pixels in an image signal (television signal). A method of converting the playback image output by a playback apparatus is also available.
Furthermore, a scalability function is being examined, and a method of encoding/decoding images at a plurality of levels of temporal and spatial resolution is available.
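Scan-line scrambling of this kind can be sketched as follows. This is only an illustration under stated assumptions: the image is a list of scan lines, the lines are reordered by a permutation derived from a shared numeric key, and the names `line_permutation`, `scramble_lines`, and `descramble_lines` are hypothetical. Python's `random.Random` is used purely for reproducibility and is not a cryptographically secure generator.

```python
import random


def line_permutation(num_lines, key):
    """Derive a repeatable ordering of line indices from the key."""
    order = list(range(num_lines))
    random.Random(key).shuffle(order)
    return order


def scramble_lines(image, key):
    """Output line dst is taken from input line order[dst]."""
    order = line_permutation(len(image), key)
    return [image[src] for src in order]


def descramble_lines(scrambled, key):
    """Invert the permutation: put each line back at its source index."""
    order = line_permutation(len(scrambled), key)
    restored = [None] * len(scrambled)
    for dst, src in enumerate(order):
        restored[src] = scrambled[dst]
    return restored
```

A receiver holding the same key recovers the original line order; a receiver without it sees the same lines in a different order.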
However, in a general copyright protection process, the following problems are posed.
(1) Since the conventional IPMP technique stops decoding or playback of an image which must undergo copyright protection, no information of the contents (e.g., image) can be provided to an unauthorized viewer (non-subscriber) of that video or the like. Originally, the contents provider wants to distribute contents to more viewers. That is, in order to distribute contents to more viewers and to obtain more subscribers, the contents provider wants to provide information of given contents even to viewers not entitled to these contents so that they can recognize the contents to some extent.
(2) In the aforementioned image coding schemes, when the entire bitstream is scrambled by the conventional scheme, a viewer who has a decoder that cannot descramble the scrambled bitstream or a viewer not entitled to these contents cannot normally descramble the bitstream, and cannot recognize a video at all.
(3) The aforementioned image coding schemes implement high coding efficiency by exploiting the correlation of images in the space and time directions. When an input image upon encoding is scrambled by the conventional scheme, the correlation of images in the space and time directions is lost, thus considerably impairing coding efficiency.
(4) Furthermore, even when a bitstream is partially scrambled, in a playback image of a moving image coding scheme using inter-frame predictive coding, distortion in a given frame propagates to the next frame, and is gradually accumulated. For this reason, the distortion generated is not steady; when a playback image is viewed on the decoding side, whether distortion is caused by scrambling or is a symptom of another operation error can hardly be discriminated.
(5) In recent years, the processing of an image encoding/decoding apparatus has become complicated, and software encoding/decoding is often used. In such a case, if the load of the scramble process, which is added to that of the image encoding/decoding process, is heavy, the performance of the overall apparatus drops.
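The accumulation of distortion described in problem (4) can be made concrete with a small numeric sketch. It assumes the simplest possible predictive coder, in which each frame is a single sample reconstructed as the previous decoded value plus a residual; the names `decode_residuals` and `drift` are hypothetical, and `errors` stands for whatever disturbance scrambling adds to each residual.

```python
def decode_residuals(first, residuals):
    """Predictive decoding: each frame is the previous decoded frame
    plus the residual received for it."""
    frames = [first]
    for r in residuals:
        frames.append(frames[-1] + r)
    return frames


def drift(first, residuals, errors):
    """Per-frame difference between decoding disturbed residuals and
    decoding clean ones. errors[i] is added to residuals[i]."""
    disturbed = [r + e for r, e in zip(residuals, errors)]
    clean = decode_residuals(first, residuals)
    noisy = decode_residuals(first, disturbed)
    return [n - c for n, c in zip(noisy, clean)]
```

For example, `drift(100, [1, 1, 1, 1], [0, 5, 0, 3])` returns `[0, 0, 5, 5, 8]`: the error introduced in one frame persists in every later frame, and a second error adds to it rather than replacing it.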
MPEG (Moving Picture Experts Group)-4 mentioned above is a scheme for combining multimedia data containing a plurality of objects such as a moving image object, audio object, and the like, and sending them as a single bitstream. Hence, the receiving side (playback side) of MPEG-4 plays back, e.g., audio and moving picture scenes in association with each other. Such an MPEG-4 player must be able to impose various limitations on the use of all or some data to protect the copyrights and the like of their rightful holders.
Unlike a conventional multimedia stream, an MPEG-4 data stream has a function of independently sending/receiving a plurality of video scenes and video objects on a single stream. As for audio data, a plurality of objects can likewise be decoded from a single stream. For this purpose, the MPEG-4 data stream contains BIFS (Binary Format for Scenes), obtained by modifying VRML (Virtual Reality Modeling Language), as information for compositing these scenes.
Since individual objects required for such scene composition are sent while independently undergoing optimal coding, the decoding side decodes them individually. The player then synchronously composites and plays back scenes by adjusting the time axes of individual data to the internal time axis of the player in accordance with the description of BIFS.
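The adjustment of individual time axes to the player's internal time axis can be sketched as follows. This is a simplified illustration, not the BIFS mechanism itself: it assumes that each object carries timestamps counted in ticks of its own clock, that the clock rate of each object is known, and that all objects start at player time zero. The names `to_player_time` and `composition_order` are hypothetical.

```python
from fractions import Fraction


def to_player_time(ticks, ticks_per_second):
    """Convert an object-local timestamp to seconds on the player clock."""
    return Fraction(ticks, ticks_per_second)


def composition_order(objects):
    """objects maps an object name to (ticks_per_second, [timestamps]).
    Returns (player_time, name) pairs sorted by player time."""
    events = []
    for name, (rate, stamps) in objects.items():
        for t in stamps:
            events.append((to_player_time(t, rate), name))
    events.sort()
    return events
```

Exact fractions are used so that units from clocks with different rates that fall on the same instant compare as equal rather than differing by rounding error.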
In order to protect copyrights, a process for sending processed data by encrypting data to be sent or embedding digital watermark data is required on the sending side. The receiving side, i.e., the player side acquires information for decrypting (decoding) the encrypted data or information required for authentication using the digital watermark when the user pays a given fee for the copyrights, and reconstructs data containing a desired moving image and audio from the processed data and plays back the decoded data. Upon decrypting the encrypted data or authenticating using the digital watermark, copyright protection is attained by limiting the number of copies of data or inhibiting decoded data from being edited with other objects.
In this way, since the MPEG-4 player composites a plurality of objects, use limitations must be set for individual objects according to copyrights. For this purpose, the present applicant has proposed a system for obtaining authentication information that pertains to copyright use of each object in Japanese Patent Application No. 10-295936.
However, no method has been proposed for playing back, at lowered quality (image size, image quality, sound quality, or the like), a specific object the use of which is denied as a result of authentication, or an object the playback of which is limited since no given fee is paid for copyrights.
The present invention has been made in consideration of the aforementioned prior art, and has as its object to provide an image processing method and apparatus, and a storage medium, which scrambles an image signal which requires copyright protection upon encoding an image, allows an image decoding apparatus of an authentic viewer to normally play back an image, and allows an image decoding apparatus of an unauthorized viewer to play back an image so that the viewer can recognize an outline of the image.
It is another object of the present invention to generate an encoded image signal, image contents of which can be roughly recognized even by an apparatus of an unauthorized viewer.
It is still another object of the present invention to allow an apparatus of even an unauthorized viewer to decode an encoded image signal so that the viewer can roughly recognize the image contents.
It is still another object of the present invention to partially scramble a bitstream which requires copyright protection upon encoding an image, and to encode an image without any coding efficiency drop.
It is still another object of the present invention to control playback quality of each object on the basis of whether or not the user who is about to restore and play back data is authentic upon restoring and playing back information from a data stream containing a plurality of objects.
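The kind of per-object quality control stated in this object can be pictured with a minimal sketch. It is an illustration only and does not represent the embodiments of the invention: it assumes that a decoded image object is a list of pixel rows, that the authentication result is already available as a boolean, and that lowered quality means reduced image size obtained by keeping every n-th pixel. The names `downsample`, `render_object`, and `preview_factor` are hypothetical.

```python
def downsample(image, factor):
    """Keep every factor-th row and every factor-th pixel in each row."""
    return [row[::factor] for row in image[::factor]]


def render_object(image, authenticated, preview_factor=2):
    """Return the full image for an authenticated user, and a
    reduced-size version otherwise."""
    if authenticated:
        return image
    return downsample(image, preview_factor)
```

Each object in a stream would be passed through such a decision independently, so that one object can be shown at full quality while another is shown only in reduced form.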