In recent years, various kinds of multimedia equipment have been developed, and various kinds of multimedia software such as game software and education software have been marketed. However, under existing circumstances, the copyrights in such software are not satisfactorily protected, and a large amount of illegally copied software appears on the market.
With the spread of the Internet in recent years, “Electronic Commerce” (EC) is increasingly being used by users of personal computers (PC), in which the users can obtain their favorite audio data or the like by downloading the data from homepages, and paying for the data by settlement means like credit cards.
The spread of network distribution by EC saves the users the trouble of going to record stores and, therefore, the network distribution by EC has a possibility of greatly changing the current distribution system of music, especially CD marketing.
By the way, audio data obtained by downloading as described above is recorded in a portable recording medium such as a CD-R (Compact Disk Recordable), whereby the user can listen to it many times.
Therefore, once the user has obtained audio data on a PC, irrespective of whether the audio data is obtained through the Internet or from a music CD on the market, the user can freely copy the audio data by using a CD-R. In other words, the copyright for the audio data stored in the PC cannot be effectively protected. Accordingly, in order to prevent audio data which has once been obtained by downloading from being copied and transferred to another user, i.e., to prevent illegal copying, it is very important to protect the copyright in the network distribution of the audio data.
Hereinafter, downloading and reproduction of audio data by using a PC will be described.
FIG. 11 is a block diagram illustrating the structure of a PC which performs downloading of audio data and reproduction of the downloaded audio data. The PC performs recording and reproduction of audio data which has been downloaded through a network, and the PC is hereinafter referred to as a “data recording and reproduction apparatus”.
The data recording and reproduction apparatus 1000 includes a recording medium 1002 in which a compressed audio data stream is recorded as the above-described audio data, a stream writing means 1001 for writing a compressed audio data stream ESau downloaded through a network 10a into the recording medium 1002, and a stream reading means 1003 for reading the compressed audio data stream ESau from the recording medium 1002.
Further, the data recording and reproduction apparatus 1000 includes a decoding means 1004 for decompressing, by decoding, the compressed audio data stream ESau output from the stream reading means 1003 so as to output a non-compressed data stream RSau; and a DA conversion means 1005 for performing digital-to-analog (DA) conversion on the non-compressed audio data stream RSau to output analog audio data Aau to a speaker 20.
In the data recording and reproduction apparatus 1000 so constructed, when the compressed audio data stream ESau is downloaded through the network 10a, the compressed audio data stream ESau is once written in the recording medium 1002, such as a hard disk, by the stream writing means 1001.
When the audio data is reproduced in the data recording and reproduction apparatus 1000, the compressed audio data stream ESau is read from the recording medium 1002 by the stream reading means 1003. Further, the compressed audio data stream ESau is decompressed by decoding in the decoding means 1004, whereby the non-compressed audio data stream RSau is restored.
The non-compressed audio data stream RSau is converted to analog audio data Aau by the DA conversion means 1005 to be output to the speaker 20.
As described above, in the data recording and reproduction apparatus 1000 implemented by a PC, the audio data distributed on the network 10a can be easily and illegally copied by recording the compressed audio data stream ESau, which has been downloaded through the network 10a, in the recording medium 1002.
Meanwhile, MD (Mini Disc) players have become available as recording and reproduction apparatuses capable of recording digital audio data recorded on recording media such as CDs.
FIG. 12 is a block diagram illustrating the structure of an MD player.
This MD player 1100 includes a recording medium 1103 in which digital audio data is recorded, a coding means 1101 for compressing, by coding, a non-compressed audio data stream Sau read from a CD 10b to output a compressed audio data stream ESau, and a stream writing means 1102 for writing the compressed audio data stream ESau in the recording medium 1103 as the above-described digital audio data.
Further, the MD player 1100 includes a stream reading means 1104 for reading the compressed audio data stream ESau from the recording medium 1103, a decoding means 1105 for decompressing the read compressed audio data stream ESau by decoding to output a non-compressed audio data stream RSau, and a DA conversion means 1106 for performing DA conversion on the non-compressed audio data stream RSau to output analog audio data Aau to a speaker 20.
In the MD player 1100 so constructed, when the digital audio data (the non-compressed audio data stream) Sau obtained from the CD 10b is input, the non-compressed audio data stream Sau is compressed by coding in the coding means 1101 to be output as the compressed audio data stream ESau. The compressed audio data stream ESau is once written in the MD 1103 by the stream writing means 1102.
When the audio data is reproduced in the MD player 1100, the compressed audio data stream ESau is read from the recording medium 1103 by the stream reading means 1104, and the compressed audio data stream ESau is decompressed by decoding in the decoding means 1105 to be output as a non-compressed audio data stream RSau.
The non-compressed audio data stream RSau is converted to analog audio data Aau by the DA conversion means 1106 to be output to the speaker 20.
In the MD player as described above, the digital audio data recorded in the CD can be easily and illegally copied by digital-recording the audio data in the MD.
Furthermore, according to a recent trend in this technology, there is a demand for a recording and reproduction apparatus such as an MD player which is able to download audio data from home pages on the internet, and a data recording and reproduction apparatus meeting this demand has been developed.
FIG. 13 is a block diagram for explaining a data recording and reproduction apparatus which is able to obtain audio data from both a home page and a CD and to reproduce the obtained audio data.
This data recording and reproduction apparatus 1200 includes a recording medium 1204 containing a compressed audio data stream, and a stream attribute decision means 1201 for deciding whether the input audio data stream is compressed or not. Usually, the audio data stream downloaded through the network 10a is compressed while the audio data stream read from the CD 10b is not compressed.
Further, the data recording and reproduction apparatus 1200 includes a coding means 1202 for compressing, by coding, the non-compressed audio data stream Sau output from the stream attribute decision means 1201 to output a compressed audio data stream ESau; and a stream writing means 1203 for writing, into the recording medium 1204, the compressed audio data stream ESau which is output from the coding means 1202 and the compressed audio data stream ESau which is output from the stream attribute decision means 1201.
Further, the data recording and reproduction apparatus 1200 includes a stream reading means 1205 for reading the compressed audio data stream ESau from the recording medium 1204, a decoding means 1206 for decompressing, by decoding, the read compressed audio data stream ESau to output a non-compressed audio data stream RSau, and a DA conversion means 1207 for performing DA conversion of the non-compressed audio data stream RSau to output analog audio data Aau to the speaker 20.
In the data recording and reproduction apparatus 1200 so constructed, when an audio data stream is input, the stream attribute decision means 1201 decides whether this audio data stream is compressed or not. According to the result of the decision, the audio data stream is output to one of the stream writing means 1203 and the coding means 1202. For example, when the compressed audio data stream ESau is input through the network 10a, this compressed audio data stream ESau is output to the stream writing means 1203 according to the decision of the stream attribute decision means 1201. On the other hand, when the non-compressed audio data stream Sau obtained from the CD 10b is input, this non-compressed audio data stream Sau is output to the coding means 1202 according to the decision of the stream attribute decision means 1201.
The compressed audio data stream ESau is once written in the recording medium 1204 by the stream writing means 1203.
When the audio data is reproduced in the data recording and reproduction apparatus 1200, the compressed audio data stream ESau is read from the recording medium 1204 by the stream reading means 1205. This compressed audio data stream ESau is decompressed by decoding in the decoding means 1206 to be output as a non-compressed audio data stream RSau.
Thereafter, the non-compressed audio data stream RSau is converted to analog audio data Aau by the DA conversion means 1207 to be output to the speaker 20.
In the data recording and reproduction apparatus shown in FIG. 13, illegal copying of audio data for which the copyright is to be protected becomes easier and, therefore, preventing such illegal copying of audio data is of greater importance.
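The signal path of the apparatus 1200 described above can be sketched as follows. This is only an illustrative model: the class AudioStream, its "compressed" flag, and the use of zlib as a stand-in for the audio codec are assumptions made for the example and do not appear in the description. The sketch shows that nothing in the path from input to recording medium examines whether the data may be copied.

```python
import zlib
from dataclasses import dataclass


@dataclass
class AudioStream:
    payload: bytes
    compressed: bool  # hypothetical flag inspected by the attribute decision step


def encode(pcm: bytes) -> bytes:
    """Stand-in for the coding means 1202 (zlib is NOT an audio codec)."""
    return zlib.compress(pcm)


def decode(es: bytes) -> bytes:
    """Stand-in for the decoding means 1206."""
    return zlib.decompress(es)


def record(stream: AudioStream, medium: list) -> None:
    """Attribute decision (1201), optional coding (1202), then writing (1203)."""
    es = stream.payload if stream.compressed else encode(stream.payload)
    medium.append(es)  # the medium always holds compressed streams


def reproduce(medium: list, index: int) -> bytes:
    """Stream reading (1205) followed by decoding (1206)."""
    return decode(medium[index])


if __name__ == "__main__":
    pcm = bytes(range(256)) * 4
    medium = []
    record(AudioStream(pcm, compressed=False), medium)                # CD-like input
    record(AudioStream(zlib.compress(pcm), compressed=True), medium)  # network-like input
    print(reproduce(medium, 0) == reproduce(medium, 1))
```

Both input routes end in the same recorded form, which is why a copy restriction would have to be enforced before the writing step.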
By the way, as a countermeasure against illegal copying of audio data, there is a method of inserting a watermark in audio data for which the copyright is to be protected.
This watermark is inserted in digital audio data. Further, the watermark-inserted digital audio data is converted to analog audio data. Regardless of whether the analog audio data obtained by DA conversion of the watermark-inserted digital audio data remains as it is or whether the analog audio data is converted to digital data, the watermark can be extracted from the analog data or the digital data.
Hereinafter, a description will be given of the general principle of watermark insertion and extraction.
Initially, the outline of a process of inserting a watermark in digital audio data will be described.
FIG. 14 is a diagram conceptually illustrating the insertion of a watermark in digital audio data and the extraction of the watermark therefrom.
With reference to FIG. 14, signature data (watermark) is inserted in digital audio data recorded as an audio data file ODau (signature data insertion step Pad), and then the digital audio data in which the signature data is inserted is recorded as a signature-data-inserted audio data file SDau.
The signature data Dwmx inserted in the digital audio data is extracted in accordance with the digital audio data recorded as the audio data file ODau and the digital audio data recorded as the signature-data-inserted audio data file SDau.
FIG. 15 is a flowchart of the watermark insertion process.
Initially, digital audio data is subjected to blocking (step S1). This process is to divide the digital audio data into a plurality of data groups (blocks) each comprising a predetermined number of sampling data as a matter of convenience in the subsequent process.
Next, each block is subjected to the Fourier transform (step S2). The arithmetic operation for the Fourier transform will be described later in detail.
Thereafter, the following data transform is carried out as the watermark insertion process.
The watermark is composed of multiple bits of digital data (signature data), and each bit of the signature data corresponds to one block.
Initially, it is determined whether the value of each bit as a component of a bit string (block string) of the signature data is “0” or “1” (step S3). A block corresponding to a bit of “0” is not subjected to watermarking. On the other hand, a block corresponding to a bit of “1” is subjected to watermarking, wherein an imaginary number part and a real number part of Fourier transform coefficients of audio data corresponding to this block are replaced with each other, and the real number part is multiplied by −1 (step S4). This process is performed for each block corresponding to a bit of “1”.
Then, each block, irrespective of whether the block corresponds to “0” or “1” is subjected to the inverse Fourier transform (step S5). Thereby, audio data of each block is restored. The inverse Fourier transform will be described later in more detail.
Through the above-described processes, a watermark which is inaudible to a normal human ear is inserted in the audio data.
Hereinafter, the respective processes will be described in more detail.
Initially, the Fourier transform and the inverse Fourier transform will be briefly described. The Fourier transform employed in the process of embedding a watermark (information to be embedded) is called “discrete Fourier transform” and is defined as follows.
When a discrete one-dimensional real number function f(n) (n ∈ Z, 0 ≤ n < N) is given, the function obtained by performing the discrete Fourier transform on f(n) is the discrete one-dimensional complex number function F(k) (k ∈ Z, 0 ≤ k < N) which is given by formula (1) identified below.
Here, Z denotes the set of all integers. Further, formula (1) is subject to the conditions given by formulae (2) and (3).
$$F(k) = \sum_{n=0}^{N-1} f(n)\, W_N^{-kn} \quad (k = 0, 1, \ldots, N-1) \tag{1}$$
$$j^2 = -1 \tag{2}$$
$$W_N = e^{j2\pi/N} = \cos(2\pi/N) + j\sin(2\pi/N) \tag{3}$$
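As an illustration, formula (1) can be evaluated directly with the twiddle factor W_N of formula (3). The following is a minimal sketch that uses only the Python standard library; the function name dft is chosen for this example. It costs on the order of N² operations, so a practical implementation would more likely use a fast Fourier transform.

```python
import cmath


def dft(f):
    """Discrete Fourier transform of formula (1): F(k) = sum_n f(n) * W_N**(-k*n)."""
    N = len(f)
    W = cmath.exp(2j * cmath.pi / N)  # W_N = e^{j*2*pi/N}, formula (3)
    return [sum(f[n] * W ** (-k * n) for n in range(N)) for k in range(N)]


if __name__ == "__main__":
    # A constant signal has all of its energy in the k = 0 coefficient.
    for coefficient in dft([1.0, 1.0, 1.0, 1.0]):
        print(round(abs(coefficient), 9))
```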
Further, the inverse discrete Fourier transform will be described hereinafter.
When a discrete one-dimensional real number function f(n) (n ∈ Z, 0 ≤ n < N) is given and a discrete one-dimensional complex number function F(k) (k ∈ Z, 0 ≤ k < N) is a function obtained by performing the discrete Fourier transform on f(n), the following formula (4) holds.
Here, Z denotes the set of all integers. Further, formula (4) is subject to the conditions given by formulae (5) and (6).
$$f(n) = \frac{1}{N} \sum_{k=0}^{N-1} F(k)\, W_N^{kn} \quad (n = 0, 1, \ldots, N-1) \tag{4}$$
$$j^2 = -1 \tag{5}$$
$$W_N = e^{j2\pi/N} = \cos(2\pi/N) + j\sin(2\pi/N) \tag{6}$$
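The relationship between the two transforms can be checked numerically. The sketch below assumes the usual inverse of formula (1), namely f(n) = (1/N) Σ F(k) W_N^{kn} summed over k = 0, ..., N−1; the function names dft and idft are chosen for this example. It shows that applying the inverse transform to the transformed data restores the original samples up to rounding error.

```python
import cmath


def dft(f):
    """Forward transform, formula (1), with W_N**(-k*n) = exp(-j*2*pi*k*n/N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(F):
    """Inverse transform (assumed form): f(n) = (1/N) * sum_k F(k) * W_N**(k*n)."""
    N = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


if __name__ == "__main__":
    samples = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
    restored = idft(dft(samples))
    # The imaginary parts are rounding noise for real input.
    print(max(abs(r - s) for r, s in zip(restored, samples)) < 1e-9)
```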
Next, the watermark embedding process for audio data will be described more specifically.
First of all, blocking of audio data will be described with reference to FIG. 16.
Blocking is a process of representing the sample values Sound(i) of digital audio data in which a watermark is to be embedded (hereinafter referred to as target audio data) as a set of blocks, each comprising 2^n samples (the n-th power of 2). Here, it is assumed that the total number of blocks obtained by blocking the target audio data is (t+1), the first block is block B0, the k-th block (k is an arbitrary number) is block Bk, and the last block is block Bt. Further, the sample values of the k-th block are represented by Bk(j).
The relationship between the sample values Sound(i) of the target audio data and the respective sample values Bk(j) in the block is represented by the following formula (7).
$$B_k(j) = \mathrm{Sound}(i) \tag{7}$$
where Z denotes the set of all integers, k and j satisfy k, j ∈ Z, and i satisfies i = 2^n·k + j (0 ≤ j < 2^n).
Needless to say, the variables n and k used here are different from the variables n and k used in formula (1), which defines the general discrete one-dimensional Fourier transform, and in formula (4), which defines the discrete one-dimensional inverse Fourier transform.
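The blocking of formula (7) amounts to slicing the sample sequence into consecutive runs of 2^n samples. The following sketch illustrates this; the function name split_into_blocks is chosen for this example, and the handling of a trailing partial block (it is simply dropped) is an assumption, since the description does not say what happens to it.

```python
def split_into_blocks(sound, n):
    """Return blocks B_0..B_t with B_k(j) = Sound(2**n * k + j), 0 <= j < 2**n."""
    size = 2 ** n
    count = len(sound) // size  # assumption: an incomplete last block is discarded
    return [sound[k * size:(k + 1) * size] for k in range(count)]


if __name__ == "__main__":
    sound = list(range(10))
    for k, block in enumerate(split_into_blocks(sound, 2)):
        print(k, block)
```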
Next, the watermark embedding process will be described.
Initially, the audio data (sample values) Bk(j) of the k-th block Bk are subjected to the discrete Fourier transform so as to obtain data Fk(m). Here, k is a variable indicating an arbitrary block amongst the blocks B0 to Bt, and it satisfies k ∈ Z, k ∈ [0, t], where t is the index of the last block.
Further, a data bit string to be inserted is defined by a one-dimensional discrete integer function U(d), and data which is obtained by embedding information in the data Fk(m) (m ∈ Z, m ∈ [1, 2^n]) according to the value of each bit in the data bit string defined by the function U(d) is represented by F′k(m).
Here, d and dn satisfy d, dn ∈ Z, and dn satisfies dn < 2^(n−1). When d satisfies d ∈ [1, dn], U(d) is 1 or 0. When d does not satisfy d ∈ [1, dn], U(d) is 0.
Then, F′k(m) is represented by the following formulae (8)–(15), wherein m satisfies m ∈ Z, m ∈ [1, 2^n].
$$\mathrm{Re}(F'_k(m)) = -\mathrm{Im}(F_k(m)) \quad (\text{when } U(m) = 1) \tag{8}$$
$$\mathrm{Re}(F'_k(m)) = \mathrm{Re}(F_k(m)) \quad (\text{when } U(m) = 0) \tag{9}$$
$$\mathrm{Im}(F'_k(m)) = \mathrm{Re}(F_k(m)) \quad (\text{when } U(m) = 1) \tag{10}$$
$$\mathrm{Im}(F'_k(m)) = \mathrm{Im}(F_k(m)) \quad (\text{when } U(m) = 0) \tag{11}$$
$$\mathrm{Re}(F'_k(2^n - m + 1)) = \mathrm{Im}(F_k(m)) \quad (\text{when } U(m) = 1) \tag{12}$$
$$\mathrm{Re}(F'_k(2^n - m + 1)) = \mathrm{Re}(F_k(m)) \quad (\text{when } U(m) = 0) \tag{13}$$
$$\mathrm{Im}(F'_k(2^n - m + 1)) = \mathrm{Re}(F_k(m)) \quad (\text{when } U(m) = 1) \tag{14}$$
$$\mathrm{Im}(F'_k(2^n - m + 1)) = \mathrm{Im}(F_k(m)) \quad (\text{when } U(m) = 0) \tag{15}$$
The above-described formulae (8)–(11) are applied to the low-frequency components amongst the 2^n pieces of data (frequency components) Fk(m) obtained by subjecting the 2^n pieces of data (sample values) Bk(j) to the discrete Fourier transform. On the other hand, the above-described formulae (12)–(15) are applied to the high-frequency components of the 2^n pieces of data (frequency components) Fk(m) obtained by subjecting the 2^n pieces of data (sample values) Bk(j) to the discrete Fourier transform.
Further, as represented by formulae (9), (11), (13) and (15), a block corresponding to a bit of 0 in the signature data bit string is not subjected to the watermark embedding process. On the other hand, as represented by formulae (8), (10), (12) and (14), a block corresponding to a bit of 1 in the signature data bit string is subjected to the watermark embedding process, in which the imaginary number part and the real number part of the data Fk(m) obtained by the Fourier transform of the audio data (sample values) Bk(j) corresponding to this block are replaced with each other, and the real number part is multiplied with −1.
Further, the watermark embedding process is performed on pairs of the Fourier transformed data on the low-frequency side and the corresponding Fourier transformed data on the high-frequency side so that the target audio data in which information is embedded is not offensive to the ear of the listener. Here, the m-th Fourier transformed data F′k(m) which has been subjected to the watermark embedding process corresponds to the (2^n−m+1)-th Fourier transformed data F′k(2^n−m+1) which has been subjected to the watermark embedding process.
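The coefficient operation described above, in which the real and imaginary parts are exchanged and the new real part is negated, is the same as multiplying the coefficient by j, since j(a + jb) = −b + ja. The sketch below applies this operation to both members of each low/high frequency pair for every bit of 1. It is one possible reading only: it assumes, in line with formulae (18)–(21) given later, that the same operation is applied to each coefficient of the pair, and the names swap_and_negate and embed_bits are chosen for this example.

```python
def swap_and_negate(c: complex) -> complex:
    """Re' = -Im, Im' = Re (formulae (8) and (10)); equivalent to 1j * c."""
    return complex(-c.imag, c.real)


def embed_bits(F, U):
    """Embed bits U(1)..U(dn) into a copy of the 2**n coefficients in F.

    F is 0-based, so F_k(m) is F[m - 1] and F_k(2**n - m + 1) is F[len(F) - m].
    """
    size = len(F)
    if len(U) >= size // 2:  # the description requires dn < 2**(n - 1)
        raise ValueError("too many bits for this block size")
    out = list(F)
    for m, bit in enumerate(U, start=1):
        if bit == 1:
            low, high = m - 1, size - m
            out[low] = swap_and_negate(F[low])
            out[high] = swap_and_negate(F[high])
        # A bit of 0 leaves both coefficients unchanged (formulae (9), (11), (13), (15)).
    return out


if __name__ == "__main__":
    F = [complex(i + 1, -(i + 2)) for i in range(8)]
    for before, after in zip(F, embed_bits(F, [1, 0, 1])):
        print(before, "->", after)
```

Only the coefficients selected by a bit of 1 change, which is what the later comparison against the unmarked data relies on.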
Next, the watermark extraction process will be described.
FIG. 17 is a flowchart of the watermark extraction process.
Initially, audio data which has been subjected to the watermark embedding process is divided into plural blocks (step S11a), and each of the plural blocks of the audio data subjected to the watermark embedding process is subjected to the Fourier transform (step S12a). Further, audio data which has not been subjected to the watermark embedding process is divided into plural blocks (step S11b), and each of the plural blocks of the audio data which has not been subjected to the watermark embedding process is subjected to the Fourier transform (step S12b).
Then, the data obtained as the results of the above-described Fourier transform steps are compared, block by block, between the blocks of the audio data which have been subjected to the watermark embedding process and the corresponding blocks of the audio data which have not been subjected to the watermark embedding process (step S13).
As a result of the comparison, when the data of the corresponding blocks are the same as each other, it is decided that no watermark is embedded in the block which has been subjected to the watermark embedding process, and the signature data bit is 0 (step S14). However, when the data of the corresponding blocks are different from each other, it is decided that a watermark is embedded in the block which has been subjected to the watermark embedding process, and the signature data bit is 1 (Step S15).
This process is repeated block by block to extract the bit string (embedded information) constituting the signature data.
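Steps S11 to S15 can be sketched as a block-by-block comparison of the two transformed signals. The following is a minimal illustration under stated assumptions: the function names, the fixed block size argument, and the tolerance used to decide whether two coefficients are "the same" are not part of the description, which compares the data for exact identity.

```python
import cmath


def dft(block):
    """Discrete Fourier transform of one block (formula (1))."""
    N = len(block)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / N) for n, x in enumerate(block))
            for k in range(N)]


def extract_block_bits(original, marked, block_size, tol=1e-9):
    """Return one signature bit per block: 0 if the spectra match, else 1."""
    usable = min(len(original), len(marked))
    bits = []
    for start in range(0, usable - block_size + 1, block_size):
        f_orig = dft(original[start:start + block_size])  # steps S11b, S12b
        f_mark = dft(marked[start:start + block_size])    # steps S11a, S12a
        same = all(abs(a - b) <= tol for a, b in zip(f_orig, f_mark))  # step S13
        bits.append(0 if same else 1)                      # step S14 or S15
    return bits


if __name__ == "__main__":
    original = [float(i % 7) for i in range(12)]
    marked = list(original)
    marked[5] += 0.5  # stand-in for a block altered by the embedding process
    print(extract_block_bits(original, marked, 4))
```

The comparison needs the unmarked audio data as a reference, so extraction in this scheme is not possible from the watermarked data alone.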
Next, the watermark embedding process and the watermark extracting process will be described more specifically.
Initially, the process of embedding a watermark in each block will be described.
In the following description, Sound(n) represents audio data (sample values) in one block in which signature data is to be embedded, and syomei[u] represents a signature data bit string to be embedded in data which is obtained by subjecting the audio data Sound(n) in one block to the Fourier transform. Further, F[Sound](p) represents data obtained by subjecting the target audio data Sound(n) to the discrete Fourier transform, and F′[Sound](p) represents data obtained by embedding the signature data bit string in the F[Sound](p).
Here, the audio data Sound(n) is a function defined in the integral space and has an integer as its value, where n=0,1, . . . ,N.
Further, the signature data bit string syomei[u] is also a function defined in the integral space (refer to formula (16)) but has only 0 or 1 as its value, where u=0,1.
$$\mathrm{syomei}[u] = \{1, 0\} \tag{16}$$
When the audio data Sound(n) is subjected to the Fourier transform, the corresponding Fourier transformed data F[Sound](p) is obtained as follows.
$$F[\mathrm{Sound}](p) = \sum_{n=0}^{N} \mathrm{Sound}(n)\, e^{-j2\pi pn/N} \tag{17}$$
This F[Sound](p) is a function defined in the integral space and has a complex number as its value, where p=0,1, . . . N.
Assuming that the real number part of the Fourier transformed data F[Sound](p), which is a complex number, is Re{F[Sound](p)} while the imaginary number part thereof is Im{F[Sound](p)}, the data F′[Sound](p) can be represented by using the above-described formulae (8)–(15) in accordance with the value of the signature data bit string syomei[u].
Assuming that the signature data bit string to be embedded in the Fourier transformed data F[Sound](p) corresponding to one block is syomei[0]=1, the first bit value F[Sound](1) of the Fourier transformed data F[Sound](p) and the N-th bit value F[Sound](N) thereof are subjected to the information embedding process by using the above-described formulae (8), (10), (12) and (14).
The following formulae (18)–(21) represent the Fourier transformed data F′[Sound](1) and F′[Sound](N) obtained in the watermark embedding process.
$$\mathrm{Re}\{F'[\mathrm{Sound}](1)\} = -\mathrm{Im}\{F[\mathrm{Sound}](1)\} \tag{18}$$
$$\mathrm{Im}\{F'[\mathrm{Sound}](1)\} = \mathrm{Re}\{F[\mathrm{Sound}](1)\} \tag{19}$$
$$\mathrm{Re}\{F'[\mathrm{Sound}](N)\} = -\mathrm{Im}\{F[\mathrm{Sound}](N)\} \tag{20}$$
$$\mathrm{Im}\{F'[\mathrm{Sound}](N)\} = \mathrm{Re}\{F[\mathrm{Sound}](N)\} \tag{21}$$
where Re and Im indicate the real number part and the imaginary number part of the complex number in { }, respectively.
On the other hand, assuming that the signature data bit string to be embedded in the audio data Sound(n) corresponding to one block is syomei[1]=0, the second bit value F[Sound](2) of the Fourier transformed data F[Sound](p) and the (N−1)th bit value F[Sound](N−1) thereof are subjected to the watermark embedding process by using the above-described formulae (9), (11), (13) and (15).
The following formulae (22)–(25) represent the Fourier transformed data F′[Sound](2) and F′[Sound](N−1) obtained in the watermark embedding process.
$$\mathrm{Re}\{F'[\mathrm{Sound}](2)\} = \mathrm{Re}\{F[\mathrm{Sound}](2)\} \tag{22}$$
$$\mathrm{Im}\{F'[\mathrm{Sound}](2)\} = \mathrm{Im}\{F[\mathrm{Sound}](2)\} \tag{23}$$
$$\mathrm{Re}\{F'[\mathrm{Sound}](N-1)\} = \mathrm{Re}\{F[\mathrm{Sound}](N-1)\} \tag{24}$$
$$\mathrm{Im}\{F'[\mathrm{Sound}](N-1)\} = \mathrm{Im}\{F[\mathrm{Sound}](N-1)\} \tag{25}$$
By performing the inverse discrete Fourier transform on the data F′[Sound](p) which has been obtained by subjecting the Fourier transformed data F[Sound](p) corresponding to the audio data Sound(n) in one block to the watermark embedding process by using the above-described formulae (8)–(15), watermark-embedded audio data Sound′(n) is obtained as represented by the following formula (26).
$$\mathrm{Sound}'(n) = \sum_{p=0}^{N} F'[\mathrm{Sound}](p)\, e^{j2\pi pn/N} \tag{26}$$
Next, the watermark extraction process will be described briefly.
In the watermark extraction process, the Sound(n) and the Sound′(n) are respectively subjected to the Fourier transform, and the respective Fourier-transformed data are compared. When the values of these data are different from each other, the signature data bit string is extracted with the signature bit data being 1. When the values of these data are identical, the signature data bit string is extracted with the signature bit data being 0.
The algorithm for the watermark extraction process will now be briefly described.
In { }, n moves sequentially from 1 to N.
    {
        If F[Sound](n)=F[Sound′](n) does not hold, syomei[n−1]=1
        If F[Sound](n)=F[Sound′](n) holds, syomei[n−1]=0
    }
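The extraction algorithm above can be written out as a short routine that compares the two sets of Fourier transformed data coefficient by coefficient. This is a sketch under stated assumptions: the function name extract_signature and the tolerance tol are introduced for this example (the description tests for exact equality), and the inputs are taken to be the already transformed data F[Sound] and F[Sound′].

```python
def extract_signature(f_sound, f_sound_marked, tol=1e-9):
    """Return syomei[n - 1] for n = 1..N: 1 where the coefficients differ, else 0."""
    if len(f_sound) != len(f_sound_marked):
        raise ValueError("both inputs must hold the same number of coefficients")
    return [0 if abs(a - b) <= tol else 1
            for a, b in zip(f_sound, f_sound_marked)]


if __name__ == "__main__":
    f_sound = [1 + 2j, 3 - 1j, 0.5 + 0.5j, 2 + 0j]
    # First and last coefficients have had real and imaginary parts exchanged
    # and the new real part negated; the middle two are untouched.
    f_marked = [complex(-2, 1), 3 - 1j, 0.5 + 0.5j, complex(0, 2)]
    print(extract_signature(f_sound, f_marked))
```

One limitation of this comparison is worth noting: a coefficient whose value is exactly zero is unchanged by the exchange-and-negate operation, so a bit of 1 embedded at such a coefficient would be read back as 0.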
By the way, in the existing reproduction apparatuses such as MD players which do not detect watermarks, audio data streams in which watermarks are embedded as described above can be recorded and reproduced in the same way as audio data streams in which no watermark is embedded.
If all reproduction apparatuses, such as MD players, which will be manufactured in the future refuse to record audio data streams in which watermarks of “copy inhibit” are embedded, while still recording audio data streams in which no watermark of “copy inhibit” is embedded, it will be possible to restrict illegal copying of the audio data streams by watermarking.
However, since watermarks are embedded in non-compressed audio data streams, it is difficult to simply apply the above-described watermark embedding process to the data recording and reproduction apparatus 1200 which receives compressed audio data streams from the homepage and non-compressed audio data streams from the CD.
Consequently, in the data recording and reproduction apparatus 1200 which can obtain audio data from both the homepage and the CD and reproduce the audio data, illegal copying of the audio data cannot be effectively prevented by using watermarking.