The present invention relates to a video scramble/descramble apparatus corresponding to motion predictive/orthogonal transform coding of videos.
Various encryption techniques have been studied and developed to prevent unauthorized duplication and unauthorized access for the purpose of protecting the copyrights of products containing audio or video information.
For example, in a DVD (Digital Versatile Disc) using MPEG2 video coding, the regions in which a disc can be played back are limited by region codes, and coded data is encrypted by CSS (Content Scramble System).
As scramble techniques for a baseband video signal, line rotation, which randomly sets one cut point per line and swaps the line sections to the left and right of the cut point, and line permutation, which randomly reorders scan lines, are known. Line rotation is used, in cooperation with a billing system, to limit access to pay-per-view programs on satellite broadcast and CATV (cable television).
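By way of illustration only, the two baseband techniques described above can be sketched in Python as follows. The function names, the representation of a frame as a list of scan lines, and the use of Python's `random.Random` as the source of randomness are assumptions made for this example and are not taken from any broadcast system.

```python
import random

def line_rotate(frame, rng):
    """Cut every scan line at one random point and swap the two sections."""
    cuts = [rng.randrange(len(line)) for line in frame]
    rotated = [line[c:] + line[:c] for line, c in zip(frame, cuts)]
    return rotated, cuts

def line_permute(frame, rng):
    """Randomly reorder the scan lines of a frame."""
    order = list(range(len(frame)))
    rng.shuffle(order)
    return [frame[i] for i in order], order

# Hypothetical 4-line, 8-pixel frame; the fixed seed only makes the run repeatable.
frame = [[10 * y + x for x in range(8)] for y in range(4)]
rng = random.Random(1)
rotated, cuts = line_rotate(frame, rng)
permuted, order = line_permute(frame, rng)
```

The returned `cuts` and `order` lists are what a receiver would need in order to undo each manipulation.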
For the purpose of preventing unauthorized duplication by consumer analog video tape recorders (VTRs), a copy protection technique (Macrovision Corporation) is known. In this technique, AGC signals or color stripe signals in the vertical blanking interval are manipulated to disable normal recording of copy-protected tape contents by VTRs, although such copy-protected tape contents can be normally displayed on a TV.
Furthermore, a technique called "digital watermarking" corresponding to digital contents including audio or video information is known. Digital watermarking embeds data, which cannot be visually or aurally perceived, in a baseband signal or coded data of audio or picture data, or the like. Information to be hidden by digital watermarking includes, for example, copyright information, copy generation management information, playback control information, scramble key information, and the like.
The aforementioned techniques have both merits and demerits. For example, management using region codes unconditionally allows playback in designated regions, and data encryption by CSS or the like does not inhibit playback using an authorized player. Hence, the region code or CSS can prevent coded data itself from being duplicated, but cannot prevent unauthorized duplication of a decoded video signal. On the other hand, the duplication protection system for analog VTRs depends on the VTR model, and cannot always assure the duplication protection effect. In addition, since only sync signals are manipulated, resistance against unauthorized attacks is not always high. Furthermore, hiding copyright information by, e.g., digital watermarking does not in itself technically prevent unauthorized duplication of a video signal.
More specifically, in order to prevent unauthorized duplication of a video signal, a more robust copyright protection method for the video signal itself must be used. However, when a conventional video scramble technique such as line rotation or the like is used, if the scrambled video signal is coded by MPEG2, which is used in DVD or digital broadcast, the coding efficiency is lower than in coding of a non-scrambled picture, thus deteriorating the picture quality of the reconstructed picture. This is because conventional video scrambling makes an original video picture hard to discern by lowering the temporal and spatial correlation of the picture through random manipulation of the picture, and is thus contradictory to motion predictive/orthogonal transform coding such as MPEG2 or the like, which improves coding efficiency using the temporal and spatial correlation of a picture.
This point will be described in more detail below.
MPEG2 coding uses correlation of a video signal in the space domain (intraframe correlation) and correlation in the time domain (interframe correlation), and compresses the data size by removing redundancy in both these domains. Motion prediction in units of blocks anticipates an effect of reducing video signal power using interframe correlation. Likewise, reducing the data size by the DCT (discrete cosine transform) and variable-length coding in consideration of correlation between neighboring pixels in a frame, together with quantization with weights depending on frequency in consideration of the nature of human vision, and variable-length coding of only the difference between DC components of neighboring blocks, anticipate a reduction of video signal power using intraframe correlation.
Furthermore, upon coding motion vector information in units of macroblocks, the difference between the motion vectors of neighboring macroblocks is variable-length coded as a motion vector to be coded in consideration of motion similarity between frames in association with neighboring macroblocks. In this manner, the information size to be transmitted can be reduced.
However, in the conventional video scramble technique, correlation is lowered or video contents are made hard to recognize by random manipulations of the video signal. When a video signal that has undergone processes such as conventional line rotation, line permutation, or the like is coded by MPEG2, interline correlation in a frame drops considerably, and a reduction of signal power can no longer be expected from a combination of DCT and variable-length coding.
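The loss of interline correlation described above can be made concrete with a small Python sketch. The test picture (a smooth vertical gradient), the measure (sum of absolute differences between vertically adjacent pixels), and all names are assumptions chosen for this illustration; they are not a measure defined by MPEG2.

```python
import random

def adjacent_line_sad(frame):
    """Sum of absolute differences between vertically adjacent pixels."""
    return sum(
        abs(a - b)
        for upper, lower in zip(frame, frame[1:])
        for a, b in zip(upper, lower)
    )

# Hypothetical test picture: a smooth vertical gradient, 16 lines of 8 pixels.
frame = [[4 * y] * 8 for y in range(16)]

# Conventional line permutation: reorder the scan lines at random.
order = list(range(len(frame)))
random.Random(7).shuffle(order)
permuted = [frame[i] for i in order]

sad_original = adjacent_line_sad(frame)     # 15 line gaps * 8 pixels * 4 = 480
sad_permuted = adjacent_line_sad(permuted)  # never smaller for a gradient
```

For a monotonic gradient the original line order gives the smallest possible value of this measure, so any reordering can only raise it, which is the effect that degrades DCT and variable-length coding.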
When vertical motion components exist in the time domain, even when an original video picture has predictive efficiency in motion prediction in units of macroblocks, the similarity between a reference picture and the picture to be coded is lowered as a result of scrambling, and the predictive efficiency drops considerably. More specifically, the correlation of a video signal expected in MPEG2 coding is considerably lowered, and it consequently becomes hard to reduce video signal power. In order to achieve coding at a predetermined bit rate, the number of coded bits must be reduced by coarse quantization, resulting in a drop in image quality of the decoded picture.
As described above, as a robust copyright protection method for a video signal, a scramble process for the video signal itself is effective. However, when the conventional video scramble technique is combined with coding such as MPEG2 that uses temporal and spatial correlation, the coding efficiency suffers, resulting in deterioration of image quality of the reconstructed picture.
It is an object of the present invention to provide a video scramble apparatus and video descramble apparatus, which are free from deterioration of image quality even in coding as a combination of motion predictive and orthogonal transform like MPEG2 coding.
It is another object of the present invention to provide a video scramble apparatus which can implement video scramble that can minimize coding efficiency drop and can maintain high image quality by selecting a frame which is not used as a reference picture in interpicture predictive coding, i.e., interfield or interframe predictive coding and scrambling the selected frame using one or both of pixel replacing in units of m slices in a predetermined vertical range or pixel replacing in units of n consecutive macroblocks within a predetermined horizontal range.
According to the first aspect of the present invention, there is provided a video scramble apparatus comprising a scramble unit which scrambles a video signal, and a coding unit which performs interpicture predictive coding of the video signal scrambled by the scramble unit, wherein the scramble unit selects a picture, which is not used as a reference picture for interpicture prediction in the coding unit, from the video picture signal, and replaces slices as sets of macroblocks located on identical scan lines in the video picture signal of the selected picture in units of m slices which are consecutive in a vertical direction in the picture.
In MPEG2 coding, a picture which is not used as a reference picture means all B-pictures (bi-directional predictive coded pictures), I-pictures (intraframe coded pictures) which are not referred to from other frames, and P-pictures (forward predictive coded pictures) which are not referred to from other frames.
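The selection rule above can be sketched in Python as follows. The picture-type string and the optional `referenced` set are hypothetical inputs introduced for this example; an actual coder would obtain this information from its own prediction structure.

```python
def scramble_candidates(picture_types, referenced=None):
    """Return indices of pictures that are not used as reference pictures.

    picture_types: string of picture types such as "IBBPBBP" (assumed format).
    referenced:    optional set of indices of I/P pictures that other pictures
                   predict from. If omitted, every I- and P-picture is treated
                   as a reference picture, so only B-pictures are returned.
    """
    if referenced is None:
        referenced = {i for i, t in enumerate(picture_types) if t in "IP"}
    return [i for i in range(len(picture_types)) if i not in referenced]

only_b = scramble_candidates("IBBPBBP")              # [1, 2, 4, 5]
with_last_p = scramble_candidates("IBBPBBP", {0, 3})  # [1, 2, 4, 5, 6]
```

In the second call the final P-picture is assumed not to be referred to from any other picture, so it also becomes a candidate for scrambling.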
Since the video scramble apparatus implements scrambling by replacing, in units of slices in the vertical direction, only a picture which is not used as a reference picture of interframe predictive coding, a predictive signal of a coded macroblock can be extracted from an appropriate position of the reference picture as in normal coding, thus preventing a drop in motion prediction efficiency.
Since in MPEG2 coding intraframe correlation is used only in a block and only between blocks in a slice, the intraframe correlation never lowers. Furthermore, upon motion vector coding, since differences are coded in units of neighboring macroblocks in a slice except for the head position of the slice, the motion vector differences become constant irrespective of the presence/absence of scrambling except for the head position of the slice, and the number of coded bits of motion vector data can be prevented from increasing.
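The statement that motion vector differences are unchanged except at the head of a slice can be checked with a simplified Python sketch. It models a single vector component, assumes the predictor is reset to zero at the head of each slice, and ignores the other predictor resets defined by the standard; the vector values and the offset are hypothetical.

```python
def mv_differences(vertical_mvs):
    """Differential coding of one motion-vector component along a slice.

    The predictor is reset to 0 at the head of the slice, so the first
    macroblock is effectively coded without a difference.
    """
    diffs, predictor = [], 0
    for mv in vertical_mvs:
        diffs.append(mv - predictor)
        predictor = mv
    return diffs

slice_mvs = [2, 2, 3, 3, 4, 4]   # hypothetical vertical components in one slice
offset = 32                      # constant shift caused by moving the whole slice
moved = [mv + offset for mv in slice_mvs]

original_diffs = mv_differences(slice_mvs)  # [2, 0, 1, 0, 1, 0]
moved_diffs = mv_differences(moved)         # [34, 0, 1, 0, 1, 0]
```

Because every macroblock in the moved slice receives the same offset, only the first entry differs between the two lists.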
In MPEG2 coding, the variable-length coding scheme upon coding motion vector data is determined based on the maximum values of horizontal and vertical components of motion vectors in the frame, and as the maximum values become larger, the code length increases. Hence, when slices are replaced arbitrarily, the maximum value of vertical components of motion vectors increases, and the number of coded bits of motion vector data increases. However, upon replacing a predetermined number of slices within a group including these slices, an increase in vertical component of the motion vector can be suppressed to be equal to or smaller than a predetermined value, and the number of coded bits of motion vector data can be minimized. Upon replacing slices, when motion vectors with respect to a reference picture are detected from macroblocks in the replaced slice, motion vectors are preferably found by search from a broad range in the vertical direction in correspondence with an increase in motion amount corresponding to replacement of slices.
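A minimal Python sketch of replacing slices only within groups of m consecutive slices is given below. The function names, the choice of 30 slice rows with m = 5, and the use of `random.Random` are assumptions for illustration; the point is that no slice moves by more than m - 1 rows, which bounds the added vertical motion.

```python
import random

def grouped_slice_permutation(num_slices, m, rng):
    """Permute slice rows only inside groups of m vertically consecutive slices.

    Returns perm, where perm[dst] is the source row placed at row dst.
    """
    perm = []
    for start in range(0, num_slices, m):
        group = list(range(start, min(start + m, num_slices)))
        rng.shuffle(group)
        perm.extend(group)
    return perm

def replace_slices(slices, perm):
    """Reorder a list of slices according to perm."""
    return [slices[src] for src in perm]

num_slices, m = 30, 5   # e.g. 480 lines in 16-line slices; m is a free parameter
perm = grouped_slice_permutation(num_slices, m, random.Random(3))
max_shift_rows = max(abs(src - dst) for dst, src in enumerate(perm))
```

With 16-line slices, the bound of m - 1 rows corresponds to at most 16 × (m - 1) lines of additional vertical displacement, which indicates how far the vertical search range would need to be widened.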
According to the second aspect of the present invention, there is provided a video scramble apparatus comprising a coding unit which performs inter-picture predictive coding of a video signal, and outputting first coded video data; and a scramble unit which scrambles the first coded video data output from the coding unit, wherein the scramble unit selects second coded video data corresponding to a picture, which is not used as a reference picture for inter-picture prediction in the coding unit, from the first coded video data, and replaces the selected second coded video data corresponding to slices in units of m slices which are consecutive in a vertical direction in the picture, the slices being sets of macroblocks located on an identical scan line.
In this manner, in the video scramble apparatus of the second aspect, after video coding for the input video, e.g., MPEG2 coding, slices in a frame are replaced on the level of coded data as in the video scramble apparatus of the first aspect, thus obtaining coded video data which has undergone scrambling equivalent to that by the video scramble apparatus of the first aspect. In this case, the motion vector search range need not be broadened upon scrambling, and a motion vector search can be made within a normal search range.
According to the third aspect of the present invention, there is provided a video scramble apparatus comprising a coding unit which performs inter-picture predictive coding of a video signal, and outputting first coded video data; and a scramble unit which scrambles the first coded video data output from the coding unit, wherein the scramble unit selects second coded video data corresponding to a picture, which is not used as a reference picture for inter-picture prediction in the coding unit, from the first coded video data, and replaces the selected second coded video data corresponding to slices in units of m slices which are consecutive in a vertical direction in the picture, the slices being sets of macroblocks located on an identical scan line, and the scramble unit includes a multiplexer which adds an offset to a vertical component of a motion vector of each of the macroblocks constituting the slices in accordance with the replacement of the coded video data and multiplexes an added result to the coded video data.
More specifically, in the video scramble apparatus of the third aspect, offset addition to the vertical component of a motion vector of each macroblock is added to the scramble unit in the video scramble apparatus of the second aspect.
The video scramble apparatus of the third aspect can obtain the following effects in addition to the same effects as those of the video scramble apparatus of the second aspect. More specifically, in combination with replacement of coded data in units of slices, for only a macroblock such as a first macroblock of each slice, which is coded without coding the difference between motion vectors, coded data of that motion vector is replaced by coded data of a motion vector added with a vertical offset upon replacing slices. In this manner, video scrambling equivalent to that in the video scramble apparatus of the first aspect can be implemented by only processes for coded data obtained by directly using a conventional video coding system.
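The vertical offset applied at the head of each replaced slice can be sketched in Python as follows. The sketch assumes frame pictures, a macroblock height of 16 lines, motion vectors expressed in whole pixels with positive values pointing downward, and hypothetical vector values; the actual coded representation of a vector in the bitstream is not modeled.

```python
MB_HEIGHT = 16  # macroblock height in lines

def head_mv_offsets(perm, mb_height=MB_HEIGHT):
    """Vertical offset, in pixels, for the head macroblock of each moved slice.

    perm[dst] = src means the slice coded at row src is output at row dst.
    A slice moved up by k rows must point k rows further down to reach the
    same area of the (unscrambled) reference picture.
    """
    return [(src - dst) * mb_height for dst, src in enumerate(perm)]

perm = [2, 0, 1, 3]               # hypothetical replacement of 4 slice rows
offsets = head_mv_offsets(perm)   # [32, -16, -16, 0]
head_mv_y = [1, -3, 0, 5]         # hypothetical head vectors, indexed by source row
new_head_mv_y = [head_mv_y[src] + off for src, off in zip(perm, offsets)]
```

Only the head vector of each slice is rewritten here; the remaining macroblocks are coded as differences and therefore inherit the offset without any change to their coded data.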
According to the fourth aspect of the present invention, there is provided a video scramble apparatus comprising a scramble unit which scrambles a video signal, and a coding unit which performs interframe predictive coding of the video signal scrambled by the scramble unit, wherein the scramble unit selects a picture, which is not used as a reference picture for inter-picture prediction in the coding unit, from the video signal, performs first division of macroblocks located on an identical scan line in the video signal of the selected picture in units of m consecutive macroblocks, performs second division in units of n consecutive macroblocks (n less than m) within the m consecutive macroblocks obtained by the first division, and replaces macroblocks in units of n consecutive macroblocks obtained by the second division within the m consecutive macroblocks obtained by the first division.
When scrambling is to be done by replacing macroblocks in the horizontal direction in an identical slice in a frame which is not used as a reference picture, the motion vector values and difference values between neighboring macroblocks upon replacing macroblocks become large, as described above, and as a consequence, the picture quality may often deteriorate due to coding efficiency drop. Especially, when macroblocks are randomly replaced, the effect of calculating the difference between the motion vectors of neighboring macroblocks is lost, and the offset of the motion vector increases to a value around the horizontal size of the screen at maximum.
By contrast, the video scramble apparatus according to the fourth aspect performs first division in units of m macroblocks, which succeed in the horizontal direction, performs second division for further dividing each of macroblock groups obtained by the first division in units of n consecutive macroblocks (m greater than n), and replaces macroblocks in units of n macroblocks obtained by the second division within each macroblock group obtained by the first division. In this manner, the offset to be added to the horizontal motion vector of each macroblock upon replacing macroblocks is limited by the first division size.
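The two-level division can be sketched in Python as follows. The row of 45 macroblocks and the values m = 9 and n = 3 are hypothetical, as are the function name and the use of `random.Random`; the sketch shows only that units of n macroblocks stay intact and never leave their group of m.

```python
import random

def replace_macroblocks_in_row(row, m, n, rng):
    """Two-level division of one macroblock row.

    First division: groups of m consecutive macroblocks.
    Second division: units of n consecutive macroblocks (n < m) inside a group.
    Units are reordered only within their own group.
    """
    if not 0 < n < m:
        raise ValueError("require 0 < n < m")
    out = []
    for g in range(0, len(row), m):
        group = row[g:g + m]
        units = [group[u:u + n] for u in range(0, len(group), n)]
        rng.shuffle(units)
        for unit in units:
            out.extend(unit)
    return out

row = list(range(45))   # e.g. 720 pixels / 16 = 45 macroblocks per row
scrambled = replace_macroblocks_in_row(row, m=9, n=3, rng=random.Random(5))
max_shift = max(abs(src - dst) for dst, src in enumerate(scrambled))
```

Here `max_shift` stays below m macroblocks, which corresponds to the bound on the horizontal motion vector offset set by the first division size.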
As for the difference between horizontal motion vectors of neighboring macroblocks, the difference normally increases at the head of the set of macroblocks obtained by the second division, but does not increase at positions other than the head of the second division. More specifically, according to the video scramble apparatus of the fourth aspect, video scrambling can be implemented by replacement of horizontal macroblock sets without considerable drop of coding efficiency (deterioration of image quality of the reconstructed picture) by suppressing an increase in the number of coded bits of motion vector data in MPEG2 coding.
According to the fifth aspect of the present invention, there is provided a video scramble apparatus comprising a coding unit which performs inter-picture predictive coding of a video signal, and outputting first coded video data, and a scramble unit which scrambles the first coded video data output from the coding unit, wherein the scramble unit selects second coded video data corresponding to a frame, which is not used as a reference picture for inter-picture prediction in the coding unit, from the coded video data, performs first division of macroblocks located on an identical scan line in the selected second coded video data in units of m macroblocks, performs second division of macroblocks in units of n macroblocks (n less than m) within the m macroblocks obtained by the first division, and replaces the second coded video data corresponding to macroblocks in units of n macroblocks obtained by the second division.
According to the video scramble apparatus of the fifth aspect, after, for example, MPEG2 coding is done using a video signal before scrambling as in the video scramble apparatuses of the second and third aspects, macroblocks are replaced on the level of coded data, thus obtaining coded video data that has been scrambled.
According to the sixth aspect of the present invention, there is provided a video scramble apparatus comprising a coding unit which performs inter-picture predictive coding of a video signal, and outputting first coded video data, and a scramble unit which scrambles the first coded video data output from the coding unit, wherein the scramble unit selects a picture, which is not used as a reference picture for inter-picture prediction in the coding unit, from the video signal, performs first division of macroblocks located on an identical scan line in the video signal of the selected picture in units of m consecutive macroblocks, performs second division in units of n consecutive macroblocks (n less than m) within the m consecutive macroblocks obtained by the first division, and replaces macroblocks in units of n consecutive macroblocks obtained by the second division within the m consecutive macroblocks obtained by the first division, and the scramble unit includes a multiplexer which adds an offset to a horizontal component of a motion vector of each of the macroblocks in accordance with the replacement of the coded video data and multiplexes an added result to the coded video data.
In the video scramble apparatus according to the sixth aspect, offset addition to the horizontal component of a motion vector of each macroblock is added to the scramble unit in the video scramble apparatus according to the fifth aspect.
The video scramble apparatus according to the sixth aspect can obtain coded data that has undergone scrambling equivalent to that of the video scramble apparatus according to the fourth aspect by only processes for coded data, which is coded by a normal video coding system.
In a video scramble apparatus according to the seventh aspect of the present invention, at least one of video scramble apparatuses according to the first to third aspects is combined with at least one of video scramble apparatuses according to the fourth to sixth aspects.
Since replacement of slices in the vertical direction and replacement in units of n consecutive macroblocks in the horizontal direction are nearly free from coding efficiency drop, as described above, they may be combined to implement video scrambling as in the video scramble apparatus according to the seventh aspect. By combining these scramble schemes, more robust video scrambling can be implemented. That is, by increasing the number of scramble schemes to be combined, resistance against unauthorized attacks can be strengthened, and the effect of making an original video picture hard to recognize can be improved as the scramble manipulations become more complicated.
By controlling horizontal and vertical scramble patterns or their combinations according to the present invention, resistance against unauthorized attacks and the way a picture looks can be controlled in correspondence with the application's requests.
Video data scrambled by the video scramble apparatus of the present invention is sent to a transmission system. A storage medium may be used as a transmission system, scrambled video data may be recorded on that storage medium, and may be descrambled upon playback. A transmission line such as a terrestrial wave, satellite, cable, Internet, or the like may be used as a transmission system, and scrambled video data may be transmitted and descrambled in real time via such transmission system.
In a video scramble apparatus according to the eighth aspect of the present invention, the video scramble apparatus according to one of the first to seventh aspects further comprises a replacing pattern generator for generating a slice or macroblock replacing pattern for scrambling in the scramble unit, a descramble key generator for generating the replacing pattern or initial data for generating the replacing pattern as a descramble key, and a multiplexer for multiplexing the descramble key on at least one of video data to be coded by the coding unit, a video signal scrambled by the scramble unit, coded video data obtained by the coding unit, and audio data multiplexed with (or associated with) the encoded video data.
The replacing patterns of slices in the vertical direction and replacing patterns in units of n consecutive macroblocks in the horizontal direction may be determined based on random patterns generated by the scramble apparatus. When the random pattern itself, a random pattern generator, or its initial value is sent to an authorized receiver as a key for descrambling (descramble key; secret key), descrambling can be achieved on the receiving side.
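The relationship between a generator's initial value and descrambling can be sketched in Python as follows. Python's `random.Random` stands in for the random pattern generator purely for illustration and is not cryptographically secure; the key value and function names are hypothetical.

```python
import random

def replacing_pattern(seed, num_units):
    """Derive a replacing pattern from initial data (the seed)."""
    pattern = list(range(num_units))
    random.Random(seed).shuffle(pattern)
    return pattern

def scramble(units, seed):
    """Reorder slices or macroblock units according to the seeded pattern."""
    pattern = replacing_pattern(seed, len(units))
    return [units[src] for src in pattern]

def descramble(units, seed):
    """Regenerate the same pattern from the key and invert it."""
    pattern = replacing_pattern(seed, len(units))
    restored = [None] * len(units)
    for dst, src in enumerate(pattern):
        restored[src] = units[dst]
    return restored

slices = [f"slice{i}" for i in range(8)]
key = 0x5EED   # hypothetical descramble key (initial value of the generator)
received = scramble(slices, key)
restored = descramble(received, key)
```

Sending only the initial value instead of the whole pattern keeps the key short, at the cost of requiring the receiver to run the identical generator.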
The descramble key or a part of the descramble key can be sent via a route different from that of coded video data, e.g., via an IC card, telephone line, or the like. A part of the descramble key may be multiplexed on coded video data or audio data associated with that video data.
There are two cases: (1) the descramble key itself is obtained via a path (IC card, Internet, and the like) different from that of the coded video data; and (2) on the transmission side, a part of the constituent elements of the key data is sent via a path different from that of the coded video data while the remainder is multiplexed with the coded video data using digital watermarking, and on the receiver side the key is reconstructed by combining the part of the key data and the remainder thereof. In such cases, a part of the descramble key can be hidden in contents using the aforementioned digital watermarking technique. A part of the descramble key may be hidden in a video signal before or after scrambling using digital watermarking.
When the descramble key is hidden in the scrambled video signal, since descrambling is done after the key is detected, data can be descrambled without any delay time. When the key is hidden in a picture signal before scrambling, a key for the next scrambled picture is extracted from the descrambled picture signal to descramble that picture. In the former case, descramble key information disappears by descrambling, and any delay time between key detection and the descramble process can be minimized. Conversely, in the latter case, digital watermark information containing the descramble key remains even after descrambling and, for example, playback control information can be hidden in the picture signal together with the descramble key.
Note that, in the latter case, the hidden descramble key must be the key for a picture that is input after the picture carrying that key. These two schemes can be selectively used depending on applications.
A key for descrambling the scrambled video signal may be hidden in an audio signal associated with the video signal. Normally, a video signal and its associated audio signal are played back in strict synchronization. That is, even when the descramble key for a corresponding video signal is hidden in an audio signal, their relationship can be strictly preserved, and the video signal can be normally descrambled.
In this manner, when some data of the descramble key are sent while being hidden in a video signal or audio signal, even when the scramble pattern, i.e., the descramble key is varied temporally, normal descrambling can be done without disturbing the relationship between the scramble pattern and descramble key. Also, by changing the key frequently in consecutive video signals, resilience against unauthorized attacks can be improved.
For example, of the descramble key, data which frequently change temporally may be hidden in a video or audio signal, and data fixed in units of, e.g., programs may be sent via a route such as an IC card, telephone line, or the like. When the descramble key is hidden using digital watermarking, a key that frequently changes temporally need not be sent using blanking interval information of the video signal or another signal route, and the interface between devices can be simplified.
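One possible way of combining a fixed key part with a frequently changing key part is sketched below in Python. The text states only that the parts are combined; hashing their concatenation with SHA-256 and truncating to 64 bits, as well as the byte values and names used, are assumptions made for this example.

```python
import hashlib

def combine_key(program_part: bytes, picture_part: bytes) -> int:
    """Rebuild a descramble key from a fixed part and a changing part.

    Hashing the concatenation with SHA-256 is a choice made for this sketch;
    any agreed combining rule known to both sides would serve.
    """
    digest = hashlib.sha256(program_part + picture_part).digest()
    return int.from_bytes(digest[:8], "big")

# Hypothetical fixed part, e.g. delivered on an IC card once per program.
program_part = b"from-ic-card"

# Hypothetical changing parts, e.g. extracted from a watermark per picture.
key_a = combine_key(program_part, (1).to_bytes(4, "big"))
key_b = combine_key(program_part, (2).to_bytes(4, "big"))
```

A receiver lacking either part cannot form the key, and the resulting value could serve as the initial data from which the replacing pattern is generated.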
A video descramble apparatus according to the present invention comprises a receiver unit for receiving coded video data coded and scrambled by a video scramble apparatus of any one of the first to eighth aspects, a decoding unit for decoding the coded video data received by the receiver unit to obtain a video signal, a descramble unit for descrambling the video signal obtained by the decoding unit, and a descramble key extraction unit for extracting the descramble key from at least one of the coded video data received by the receiver unit, the video signal obtained by the decoding unit, the video signal output from the descramble unit, and audio data included in the coded video data, and the descramble unit descrambles the video signal obtained by the decoding unit using the descramble key extracted by the descramble key extraction unit.
In this manner, in the video descramble apparatus of the present invention, a descramble key is detected, encoded video data is decoded, and the decoded data is descrambled on the basis of the scramble pattern determined by the detected descramble key to output a video signal, so that a normal video signal can be played back.
Even when an unauthorized receiver decodes coded video data, he or she can only obtain a scrambled video signal, thus protecting the copyright of the corresponding contents.
Furthermore, according to the present invention, there is provided a recording medium that records video data, which is coded and scrambled by one of the video scramble apparatuses according to the first to eighth aspects. Since video data coded and scrambled according to the present invention cannot be normally played back even if its unauthorized duplication can be made, the copyright can be protected even on the recording medium.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.