At present, speech data and/or picture data, collectively referred to herein as contents data, are transmitted over the Internet by either a downloading transmission system or a streaming transmission system, the latter sometimes abbreviated herein to streaming. In the downloading transmission system, a contents data file transmitted from a distributing server is temporarily copied by a client terminal (device), and the contents data of the file, such as pictures, are subsequently reproduced. Hence, with the downloading transmission system, data cannot be reproduced on the terminal side until the file transmission is complete, so that this system is not convenient for reproducing e.g. picture data of long duration.
On the other hand, with streaming, the client terminal sequentially reproduces digital signals in real time as it receives a continuous stream of the digital signals. That is, the data received by the client terminal is reproduced even while the digital signals are still being transmitted from a streaming server to the client terminal. Streaming predominantly uses a protocol termed RTP (Real-Time Transport Protocol), which is a real-time transmission protocol prescribed in RFC 1889 of the IETF (Internet Engineering Task Force).
The streaming system ordinarily comprises an authoring device, a streaming server and a client terminal. The authoring device is supplied with encrypted contents data, that is, contents data which has been supplied from picture inputting means, such as a camera or a VTR (video tape recorder), encoded, and then encrypted with a so-called contents data key. The authoring device then formulates, from the encrypted contents data, the data for stream distribution, as prescribed by the distribution system. The data for stream distribution is composed of, for example, the encrypted contents data, to which are appended header information, used for adding to the contents data a header stating information pertinent to the entire contents data and a track for each sort of data, and ancillary information, which includes a session description protocol (SDP) file, prepared in accordance with the Session Description Protocol (RFC 2327), and packetizing control information. The data for stream distribution, thus prepared, is recorded on e.g. a recording unit of the authoring device having a recording medium such as an optical disc. If a request for the contents data is made from the client terminal, this stream distribution data is optionally taken out from the recording unit and sent to the streaming server, from which the SDP file is transmitted to the client terminal. The client terminal then acquires, from the SDP file, the information needed for receiving the stream, such as an address, a port number or a packet format. The streaming server then packetizes the data for stream distribution, in accordance with the packetizing control information, and distributes the data (stream), packet by packet, to the client terminal, such as a personal computer (PC), for real-time reproduction, in accordance with RTP and RTSP (Real-Time Streaming Protocol).
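The acquisition of the address, the port number and the packet format from the SDP file may be illustrated by the following minimal sketch. The function name `parse_sdp`, the dictionary layout and the example session text are assumptions made for illustration only; the sketch handles only the `c=`, `m=` and `a=rtpmap:` lines and is not a complete SDP parser.

```python
def parse_sdp(text):
    """Collect the connection address, media ports and payload formats from SDP text."""
    info = {"address": None, "media": []}
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # not a "<type>=<value>" line
        kind, value = line[0], line[2:]
        if kind == "c":
            # c=<nettype> <addrtype> <connection-address>
            info["address"] = value.split()[2].split("/")[0]
        elif kind == "m":
            # m=<media> <port> <proto> <fmt> ...
            media, port, proto, *formats = value.split()
            info["media"].append({
                "type": media,
                "port": int(port.split("/")[0]),
                "proto": proto,
                "formats": formats,
                "rtpmap": {},
            })
        elif kind == "a" and value.startswith("rtpmap:") and info["media"]:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            payload_type, encoding = value[len("rtpmap:"):].split(None, 1)
            info["media"][-1]["rtpmap"][payload_type] = encoding
    return info


# Hypothetical session description (addresses and ports are illustrative).
EXAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=Example session
c=IN IP4 192.0.2.10
t=0 0
m=video 5004 RTP/AVP 96
a=rtpmap:96 MP4V-ES/90000
m=audio 5006 RTP/AVP 97
a=rtpmap:97 MP4A-LATM/44100
"""

session = parse_sdp(EXAMPLE_SDP)
```

In this sketch, `session["address"]` holds the address to receive from, and each entry of `session["media"]` holds the port number and the packet (payload) format of one media stream.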
In connection with this stream distribution, there exist a large variety of distribution systems, stream compression systems and transmission protocols. Hence, the Internet Streaming Media Alliance, abbreviated herein to ISMA, which prescribes an open standard for stream distribution over the Internet of rich media (video, audio and related data), has been organized, and the procedure for adopting the open standard is now underway. An example of the open standard is hereinafter explained.
FIG. 1 depicts an example of a conventional streaming system. Referring to FIG. 1, contents data are packetized and distributed as a stream from a server 101 to a client terminal 102. The ISMA is an organization which prescribes the open standard for e.g. the code compression system or the packetizing system used for the stream data in this distribution. For example, it is prescribed that the RTSP protocol, provided for by the IETF, shall be used for session management, and that the RTP protocol, also provided for by the IETF, shall be used for media transmission (stream distribution). The open standard for encryption and packetizing is also prescribed.
For distribution of pictures or voice as digital data by such a system, copyright protection is crucial. Digital data is not deteriorated by duplication, so that, if copied data can be directly viewed or heard, a large quantity of copies may be produced, thus detracting from the commercial value of the contents data. Consequently, the general practice is to take protective measures for the digital picture and speech data by e.g. encryption, in accordance with a digital copyright management or digital rights management (DRM) system, such that, even if the data can be copied, it cannot be reproduced in the absence of a decryption key for decrypting the encrypted data.
FIG. 2 is a block diagram showing a streaming system for illustrating the conventional DRM. The streaming system is configured, for example, as shown in FIG. 2. That is, the streaming system includes a provider 111 for providing contents data D11, such as picture data and/or speech data; an authoring device (master) 112 for encrypting the contents data D11 to formulate encrypted contents data and for generating ancillary information, such as packetizing control information for controlling the generation of packets of a preset format; and a streaming server 113 for receiving, from the authoring device 112, data D12 including the encrypted contents data and the ancillary information inclusive of the packetizing control information, for packetizing the encrypted contents data in accordance with the packetizing control information, and for distributing the packetized data as stream data D15. The streaming system also includes a license management server 114, which is supplied from the provider 111 with data D13 pertinent to the rights in the contents data D11 and which performs copyright management of the contents data D11, and a client terminal 115 which, when supplied from the license management server 114 with a license D14, such as viewing rights or use conditions, and with a cipher key pertaining to the contents data D11, is able to view the received stream data D15.
By allowing only the licensed client terminal 115 to decrypt the encrypted stream data into plaintext, it is possible to protect the rights of the provider as the party furnishing the contents data.
The RTP packet used in distributing the stream data with the RTP protocol is hereinafter explained. RFC 3016 provides that video or audio data of a preset unit shall be transmitted as one RTP packet. FIGS. 3A and 3B are schematic views showing an illustrative structure of the RTP packet. Referring to FIG. 3A, a packet 200a, distributed as a stream, is made up of an RTP header 201 and an RTP payload 202, and may also include a tag 203 in case an SRTP protocol is employed.
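The division of a received packet into the RTP header, the RTP payload and the optional tag may be sketched as follows. The 12-byte fixed header and the CSRC count field follow the RTP specification; the function name `split_rtp_packet` and the tag length passed by the caller are assumptions, and the sketch does not handle the header extension or padding.

```python
import struct


def split_rtp_packet(packet, tag_len=0):
    """Split a packet into (header fields, RTP payload, trailing tag)."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    header_len = 12 + 4 * (b0 & 0x0F)  # fixed header plus CSRC list
    end = len(packet) - tag_len        # the tag, if any, trails the payload
    if end < header_len:
        raise ValueError("packet too short for its header and tag")
    header = {
        "version": b0 >> 6,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
    return header, packet[header_len:end], packet[end:]


# Illustrative packet: version 2, payload type 96, followed by a 10-byte tag
# (the tag length is an assumption chosen for this example).
example = (
    struct.pack("!BBHII", 0x80, 96, 7, 90000, 0x1234ABCD)
    + b"media-bytes"
    + b"\x00" * 10
)
header, payload, tag = split_rtp_packet(example, tag_len=10)
```

When no tag is appended, the function is called with the default `tag_len=0` and returns an empty tag.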
The RTP payload 202 includes media data (contents data) 205, which holds one or more video packets, and a cryptoheader (encrypted header) 204. In the cryptoheader 204, encryption information pertinent to the encrypted media data 205, such as the encryption system, is written as ‘IV’. Information ‘E’ indicating whether or not the data in question has been encrypted, information ‘F’ indicating whether or not the cryptoheader 204 is followed by another cryptoheader, or a key index (KI) may also be stated as optional data. An encrypted media header 206 may also be provided, along with the cryptoheader 204, in the RTP payload 202, as in a packet 200b shown in FIG. 3B. The media data 205 includes a plural number of data units. In each data unit there is written, for example, one audio frame if the data is audio data, or the data of one frame if the data is a moving picture. In the encrypted media header 206, there is written information indicating the sequence of the data units in the media data 205, specifically, the serial number in the entire media data.
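The fields of the cryptoheader may be pictured with the following sketch. The byte layout used here (one flag byte carrying ‘E’, ‘F’ and a KI-present bit, followed by an 8-byte ‘IV’ field and an optional one-byte key index) is purely hypothetical and is not taken from the text or from any standard; only the field names come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

IV_LEN = 8  # assumed length of the 'IV' field in this sketch


@dataclass
class CryptoHeader:
    encrypted: bool           # 'E': whether the data has been encrypted
    followed: bool            # 'F': whether another cryptoheader follows
    key_index: Optional[int]  # 'KI': optional key index
    iv: bytes                 # 'IV': encryption information

    def pack(self):
        """Serialize using the hypothetical layout described in the lead-in."""
        flags = (
            (self.encrypted << 7)
            | (self.followed << 6)
            | ((self.key_index is not None) << 5)
        )
        tail = bytes([self.key_index]) if self.key_index is not None else b""
        return bytes([flags]) + self.iv + tail


def parse_crypto_header(payload):
    """Return (CryptoHeader, remaining bytes) from the start of an RTP payload."""
    if len(payload) < 1 + IV_LEN:
        raise ValueError("payload too short for a cryptoheader")
    flags = payload[0]
    iv = payload[1:1 + IV_LEN]
    pos = 1 + IV_LEN
    key_index = None
    if flags & 0x20:  # KI-present bit (assumed)
        if len(payload) <= pos:
            raise ValueError("key index missing")
        key_index = payload[pos]
        pos += 1
    header = CryptoHeader(bool(flags & 0x80), bool(flags & 0x40), key_index, iv)
    return header, payload[pos:]


original_header = CryptoHeader(True, False, 3, bytes(range(IV_LEN)))
wire = original_header.pack() + b"encrypted media"
parsed_header, media = parse_crypto_header(wire)
```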
The media data is initially encrypted and subsequently packetized. The cryptoheader 204, shown in FIG. 3A, and the cryptoheader 204 as well as the encrypted media header 206, shown in FIG. 3B, depend solely on the encryption system, without dependency upon the encoding system. These headers are appended to the encrypted media data 205 as unencrypted plaintext.
For such streaming techniques, an encryption method aimed at high-speed, secure encryption is described in Japanese Patent Application Laid-Open No. 2002-111625. In the technique of this publication, an encrypted non-open key is inserted in the stream header of a stream composed of the stream header and packets, and each packet, only the data portion of which has been encrypted with a packet key generated from the non-open key, is sent to a client. An open key is handed to the client, which extracts the non-open key using the open key, generates the packet key from the non-open key, and decrypts the encrypted data portion using the packet key.
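The relationship among the open key, the non-open key and the packet key may be sketched as below. This is not the method of the cited publication: the cipher is a toy XOR keystream built from HMAC-SHA256 and must not be used for real protection, and the derivation of the packet key from the non-open key and a packet number is an assumption, since the text does not state how the packet key is generated. All names are hypothetical.

```python
import hashlib
import hmac


def keystream_xor(key, label, data):
    """Toy cipher: XOR data with an HMAC-SHA256 counter keystream (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = label + counter.to_bytes(4, "big")
        stream += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def derive_packet_key(non_open_key, packet_no):
    """Assumed derivation of a per-packet key from the non-open key."""
    message = b"pkt" + packet_no.to_bytes(4, "big")
    return hmac.new(non_open_key, message, hashlib.sha256).digest()


# Server side: wrap the non-open key into the stream header, then encrypt
# only the data portion of each packet with its packet key.
open_key = b"open-key-handed-to-client"
non_open_key = b"non-open-key-kept-by-server"
stream_header = keystream_xor(open_key, b"wrap", non_open_key)
plain_packets = [b"first data portion", b"second data portion"]
sent = [
    keystream_xor(derive_packet_key(non_open_key, n), b"data", p)
    for n, p in enumerate(plain_packets)
]

# Client side: recover the non-open key with the open key, regenerate each
# packet key, and decrypt the data portions.
recovered_key = keystream_xor(open_key, b"wrap", stream_header)
received = [
    keystream_xor(derive_packet_key(recovered_key, n), b"data", c)
    for n, c in enumerate(sent)
]
```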
In the above-described system, in which the media data (contents data) is encrypted and subsequently packetized, the media data is already encrypted at the time of packetizing, so that information on the encoding system of the media data cannot be obtained. That is, a streaming server which simply packetizes data based on the packetizing control information cannot tell whether the media data is speech data or video (picture) data, or which encoding system has been used, and hence it is difficult to perform processing peculiar to the encoding system. For example, depending on the particular encoding system used, there are occasions where some data is crucial and other data is not. In such a case, however, the crucial data cannot be distributed a plural number of times to provide so-called error resilience and thereby effect reliable distribution of the contents data. On the other hand, the RFC provides for the addition of information dependent on the encoding system (codec dependent header). Thus, for a client terminal to receive a packet to which the codec dependent header provided for in the RFC has not been added, that is, a packet of a type different from data formed in accordance with the RFC standard, the standard has to be expanded in order to cope with that particular packet.
It may be envisaged to packetize data before encryption in order to allow processing peculiar to the encoding system. FIG. 4 schematically shows the structure of a packet in which media data not yet encrypted is packetized. In case the media data is packetized and then encrypted, the media data is first packetized, and the codec dependent header, which depends on the encoding of the media data, is then added to the packetized media data. The codec dependent header and the packetized media data are then encrypted, and a cryptoheader is added to the resulting data.
That is, referring to FIG. 4, a UDP payload 210 is made up of an RTP header 211, an RTP payload 212 and a tag 213 appended as necessary. The RTP payload 212 is made up of a cryptoheader 214, a codec dependent header (codec dependency header) 217 and media data 215. The RTP header 211 and the cryptoheader 214 are non-encrypted plaintext, as aforesaid, while the codec dependent header 217 and the media data 215 are encrypted.
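The ordering of the plaintext and encrypted portions in the packet of FIG. 4 may be sketched as follows. The cipher is a toy XOR keystream standing in for the actual encryption system, the cryptoheader is assumed to carry only an 8-byte IV, and the function names and example values are hypothetical.

```python
import hashlib


def toy_cipher(key, iv, data):
    """XOR with a SHA-256 counter keystream; a stand-in, NOT a secure cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_udp_payload(rtp_header, crypto_header, codec_header, media, key, tag=b""):
    """Packetize first, then encrypt the codec dependent header and media data."""
    # In this sketch the cryptoheader itself serves as the IV (an assumption).
    protected = toy_cipher(key, crypto_header, codec_header + media)
    # The RTP header and cryptoheader stay in plaintext; the tag is optional.
    return rtp_header + crypto_header + protected + tag


rtp_header = b"\x80" + bytes(11)  # placeholder 12-byte RTP header
iv = b"\x01" * 8                  # placeholder cryptoheader contents
codec_header = b"CDH"             # placeholder codec dependent header
media = b"encoded media data"
key = b"content-key"

udp_payload = build_udp_payload(rtp_header, iv, codec_header, media, key)
protected_part = udp_payload[len(rtp_header) + len(iv):]
```

Because the toy cipher is a XOR keystream, applying `toy_cipher` again with the same key and IV restores the codec dependent header and the media data.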
However, if the media data is first packetized and subsequently encrypted, the following inconvenience arises, even though the codec dependent header can be appended. That is, the streaming server usually only distributes data in a packetized form, and does not have the rights to view the media data (contents data). Thus, if the non-encrypted media data or the packetized non-encrypted media data is supplied to the streaming server, which then performs processing such as encryption, the non-encrypted data is exposed to the streaming server, which is not desirable from the perspective of DRM.
Additionally, the client terminal has to perform the processing of decryption, reassembling and decoding of the received packet. FIG. 5 schematically shows the processing sequence at the client terminal which has received the packet shown in FIG. 4.
Referring to FIG. 5, the client terminal first decrypts the encrypted codec dependent header and the encrypted media data in a decryptor 501, and then links the media data together in a re-assembler 502 in the original sequence which the data had prior to packetizing. The media data may then be viewed by decoding the encoded media data into the original media data by e.g. a decoder 503.
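The three processing stages at the client terminal may be sketched as a simple pipeline. This is an illustration under stated assumptions only: the toy XOR cipher stands in for the actual decryption, `zlib` stands in for the media codec, each packet is modeled as a pair of a sequence number and an encrypted fragment, and all names are hypothetical.

```python
import hashlib
import zlib


def toy_cipher(key, seq, data):
    """XOR with a SHA-256 counter keystream keyed per packet (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = key + seq.to_bytes(4, "big") + counter.to_bytes(4, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key, packets):
    """Stage 1 (decryptor): decrypt each received fragment."""
    return [(seq, toy_cipher(key, seq, body)) for seq, body in packets]


def reassemble(fragments):
    """Stage 2 (re-assembler): restore the sequence prior to packetizing."""
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return b"".join(body for _, body in ordered)


def decode(encoded):
    """Stage 3 (decoder): zlib is used here as a stand-in for the codec."""
    return zlib.decompress(encoded)


# Sender side, simulated: encode, packetize into 16-byte fragments, encrypt.
key = b"content-key"
original = b"frame data " * 20
encoded = zlib.compress(original)
chunks = [encoded[i:i + 16] for i in range(0, len(encoded), 16)]
packets = [(seq, toy_cipher(key, seq, chunk)) for seq, chunk in enumerate(chunks)]
packets.reverse()  # simulate out-of-order arrival

# Client side: the data is plaintext from the output of decrypt() onward.
fragments = decrypt(key, packets)
encoded_again = reassemble(fragments)
viewed = decode(encoded_again)
```

Note that in this ordering every intermediate value after the first stage is unencrypted, which is the exposure discussed in connection with this processing sequence.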
Since the client terminal in this case decrypts the encrypted contents data immediately after receiving a packet, the client terminal handles non-encrypted data throughout the re-assembling and decoding, until the data is viewed. Hence, the contents data may be illicitly acquired by e.g. cracking the processing modules, so that tamper resistance is lowered. If, on the other hand, the processing modules are rendered tamper-proof in their entirety, the cost is increased.