Streaming media environments present many challenges for the system designer. For instance, clients can have different display, power, communication, and computational capabilities. In addition, communication links can have different maximum bandwidths, quality levels, and time-varying characteristics. A successful data streaming system must be able to stream data to different types of clients over time-varying communication links, and this streaming must be performed in a scalable and secure manner. Scalability is needed to enable streaming to a multitude of clients with different device capabilities, and security is important, particularly in wireless networks, to protect content from eavesdroppers.
In order to achieve scalability and efficiency in streaming environments, it is necessary to adapt, or transcode, the compressed data stream at intermediate network nodes. A transcoder takes a compressed data stream as the input, then processes it to produce another compressed data stream as the output. Exemplary transcoding operations include bit rate reduction, rate shaping, spatial down-sampling, frame rate reduction, and changing compression formats. Network transcoding can improve system scalability and efficiency, for example, by adapting the spatial resolution of a data stream for a particular client's display capabilities or by dynamically adjusting the bit rate of a data stream to match a wireless channel's time-varying characteristics.
By way of example, a streaming media video clip may be part of a presentation of a web page. A large and powerful desktop receiver on a high-bandwidth connection may, for instance, receive and decrypt a full-resolution, full-frame-rate video stream of high-definition television (HDTV). However, a wireless adjunct to the same network may only be able to connect wireless users at a much lower bandwidth. Therefore, the stream must be converted to a lower-bandwidth signal in order to be carried. Transcoding can achieve this conversion.
While network transcoding facilitates scalability in data streaming systems, it also poses a serious threat to the security of the streaming system. This is because conventional transcoding operations performed on encrypted streams generally require decrypting the stream, transcoding the decrypted stream, and then re-encrypting the result. Because every transcoder must decrypt the stream, each network transcoding node presents a possible breach in the security of the entire system.
Furthermore, there are potential transcoding nodes in many extensive networks that, unfortunately, cannot be trusted. These untrusted nodes may be individual computers, client intranets at remote locations, or any other node that is interposed between an original sender and an intended receiver.
More specifically, in conventional media streaming approaches that employ, for example, application-level encryption, media data is first encoded, or compressed, into a bitstream using inter-frame compression algorithms. The resulting bitstream is then encrypted, and the encrypted stream is packetized and transmitted over the network using a transport protocol such as the User Datagram Protocol (UDP).
It is noted here that, as used herein, terms such as “encode, decode, encoding, decoding, encoded, decoded, coding,” etc., refer to the compression or other encoding of data into forms suitable for transport over network carriers, whether those carriers are cable, optical fiber, wireless carrier or other types of network connection. As used herein, such terms as “encrypt, decrypt, encrypting, decrypting, encryption, decryption,” etc., refer to cryptographic encoding that is used to protect the security of data from unauthorized recipients or to verify that the data received is exactly what was originally sent.
Prior art FIG. 1A is a block diagram, 100, which illustrates the order in which conventional application-level encryption is performed. In this example, Compression Encoding, 102, is followed by Encryption, 103, and Packetization, 104. Packetization is the combining of appropriate length segments of encrypted data into packets for transmission in a network.
Prior art FIG. 1B is a block diagram, 105, which illustrates the order in which conventional network-level encryption is performed. In this example, Compression Encoding, 106, is followed by Packetization, 107, and then Encryption, 108. Here, packetization places the compressed data into packets for transmission in a network, and the packets are then encrypted.
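The difference in ordering between FIG. 1A and FIG. 1B can be sketched as follows. This is a minimal illustration only, not an implementation taken from any described system. The names `toy_cipher`, `packetize`, `application_level`, `network_level`, and `PACKET_SIZE` are hypothetical; `zlib` stands in for media compression; and the cipher is a toy XOR keystream that is NOT secure and serves only to mark where encryption occurs.

```python
import hashlib
import zlib

PACKET_SIZE = 16  # hypothetical payload size in bytes, chosen for illustration


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher: NOT secure. Applying it twice
    with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        keystream = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, keystream))
    return bytes(out)


def packetize(data: bytes, size: int = PACKET_SIZE) -> list[bytes]:
    """Split data into segments of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def application_level(media: bytes, key: bytes) -> list[bytes]:
    # FIG. 1A order: compress (102) -> encrypt (103) -> packetize (104)
    return packetize(toy_cipher(zlib.compress(media), key))


def network_level(media: bytes, key: bytes) -> list[bytes]:
    # FIG. 1B order: compress (106) -> packetize (107) -> encrypt (108)
    # Each packet gets its own toy sub-key derived from its index (an assumption).
    packets = packetize(zlib.compress(media))
    return [toy_cipher(p, key + i.to_bytes(4, "big")) for i, p in enumerate(packets)]


def receive_application_level(packets: list[bytes], key: bytes) -> bytes:
    # Reverse of FIG. 1A: reassemble -> decrypt -> decompress
    return zlib.decompress(toy_cipher(b"".join(packets), key))


def receive_network_level(packets: list[bytes], key: bytes) -> bytes:
    # Reverse of FIG. 1B: decrypt each packet -> reassemble -> decompress
    plain = b"".join(
        toy_cipher(p, key + i.to_bytes(4, "big")) for i, p in enumerate(packets)
    )
    return zlib.decompress(plain)
```

In both orderings the packet payloads are opaque to an intermediate node that lacks the key, which is why a conventional transcoder cannot operate on them directly.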
Prior art FIG. 1C is a functional block diagram of a conventional transcoding process in which encrypted data must be transcoded for the reasons discussed previously. If transcoding is required between the sender and the receiver of the stream, then the packetization and encryption discussed in conjunction with FIG. 1B must be reversed; that is, the stream must be reassembled and decrypted. In process 120, the media stream is decrypted at 122, transcoded at 124, then re-encrypted at 126. While the data is unencrypted, it is exposed to unauthorized reading or corruption at an insecure or untrusted node. Furthermore, every node that performs decryption requires the decryption key, which increases the number of places where the key can be compromised and thus the vulnerability of the system. In conventional video stream transcoding, for example, the transcoder must be able to read the content of a packet to perform transcoding, hence the decryption and re-encryption of FIG. 1C.
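The exposure described for FIG. 1C can be made concrete with a small sketch. Everything here is an assumption made for illustration: `xor_cipher` is a repeating-key XOR that is NOT secure, `ConventionalTranscoderNode` is a hypothetical class, and keeping every second byte is only a crude stand-in for a real transcoding operation such as down-sampling.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR. Toy stand-in for a real cipher: NOT secure.

    Applying it twice with the same key returns the original data.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class ConventionalTranscoderNode:
    """Hypothetical intermediate node following process 120 of FIG. 1C."""

    def __init__(self, key: bytes) -> None:
        # The node must hold the key, so the key now exists in one more place.
        self.key = key
        # Records what an untrusted node would be able to read.
        self.plaintext_seen: list[bytes] = []

    def process(self, encrypted: bytes) -> bytes:
        plain = xor_cipher(encrypted, self.key)   # decrypt (122)
        self.plaintext_seen.append(plain)         # data is exposed here
        reduced = plain[::2]                      # transcode (124), stand-in only
        return xor_cipher(reduced, self.key)      # re-encrypt (126)
```

For example, if a sender encrypts `b"ABCDEFGH"` and passes it through the node, the receiver decrypts `b"ACEG"`, while the node's `plaintext_seen` list holds the full unencrypted `b"ABCDEFGH"`.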
Although the above discussion specifically mentions the shortcomings of prior art approaches with respect to the streaming of video data, such shortcomings are not limited solely to video. Instead, the problems of the prior art span various types of media including, but not limited to, audio-based data, image-based data, speech-based data, graphic data, web page-based data, and the like.
However, the shortcomings of the prior art are well illustrated with reference to frame-coded video streaming. It is noted here that a common standard for digital video compression and transmission is that of the Moving Picture Experts Group, commonly known as MPEG. MPEG uses the similarity between frames to create a sequence of I-frames, B-frames, and P-frames. Only the I-frame contains all the compressed data necessary to produce a complete frame image. The P-frames and B-frames contain only information describing differences from other frames: a P-frame is predicted from a preceding I-frame or P-frame, and a B-frame is predicted from both preceding and following reference frames. MPEG-1 and MPEG-2 are the primary modes of digital video in common use. MPEG-2 supports a much higher quality and data rate than MPEG-1 and is the format most commonly favored for video on demand and DVD, as well as the format chosen for transmitting Digital Television. MPEG-4 is also gaining in popularity. In addition, H.261, H.263, and the emerging H.264 are important video compression standards.
In prior art FIG. 1D, the coded frames component of a compressed streamed video signal is illustrated. An exemplary video GOP, or Group of Pictures, is shown at 120. GOP 120 contains an I-frame, 121, several B-frames, 122, 124, 126 and 128, and several P-frames, 123, 125 and 127. Another I-frame is shown at 129. It is noted that I-frames are illustrated largest, P-frames smaller, and B-frames smallest. The relative size in the illustration shows the relative compression available from each type of frame. An I-frame offers the least compression and a B-frame offers the most.
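The frame ordering of FIG. 1D, and one way it relates to the frame rate reduction mentioned earlier, can be sketched as follows. This is an illustrative sketch rather than a method described in the figure: the `Frame` class and `drop_b_frames` function are hypothetical names, and the sketch assumes MPEG-1 or MPEG-2 style coding, in which no other frame is predicted from a B-frame, so B-frames can be discarded without breaking the decoding of the remaining frames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    ref: int   # reference numeral used in FIG. 1D
    kind: str  # "I", "P", or "B"


# GOP 120 of FIG. 1D: I-frame 121, then alternating B- and P-frames up to 128.
GOP_PATTERN = "IBPBPBPB"
gop = [Frame(121 + i, kind) for i, kind in enumerate(GOP_PATTERN)]


def drop_b_frames(frames: list[Frame]) -> list[Frame]:
    """Frame rate reduction by discarding B-frames.

    Assumes no remaining frame depends on a B-frame (true for
    MPEG-1/MPEG-2 style coding; not guaranteed for every codec).
    """
    return [f for f in frames if f.kind != "B"]
```

Applied to the GOP above, `drop_b_frames(gop)` keeps frames 121, 123, 125 and 127. Note that even this simple operation requires the transcoder to know each frame's type, which a conventional transcoder can learn only by reading, and therefore decrypting, the stream.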
Accordingly, a method and/or system that can enable a potentially untrusted transcoder in a network to transcode a media stream while still preserving the end-to-end security of the rest of the stream would be valuable. Specifically, a means for transcoding that allows a potentially untrusted transcoder to perform the transcoding in an appropriate manner, while still allowing the intended receiver to receive valid transmitted data and allowing any encryption of the transmitted data to remain uncompromised, would be valuable.