Streaming media environments present many challenges for the system designer. For instance, clients can have different display, power, communication, and computational capabilities. In addition, communication links can have different maximum bandwidths, quality levels, and time-varying characteristics. A successful video streaming system must be able to stream video to different types of clients over time-varying communication links, and this streaming must be performed in a scalable and secure manner. Scalability is needed to enable streaming to a multitude of clients with different device capabilities, and security is important, particularly in wireless networks, to protect content from eavesdroppers.
In order to achieve scalability and efficiency in wireless streaming environments, it is necessary to adapt, or transcode, the compressed video stream at intermediate network nodes. A transcoder takes a compressed video stream as the input, then processes it to produce another compressed video stream as the output. Exemplary transcoding operations include bit rate reduction, rate shaping, spatial down-sampling, frame rate reduction, and changing compression formats. Network transcoding can improve system scalability and efficiency, for example, by adapting the spatial resolution of a video stream for a particular client's display capabilities or by dynamically adjusting the bit rate of a video stream to match a wireless channel's time-varying characteristics.
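By way of illustration only, one of the transcoding operations listed above, frame rate reduction, can be sketched as follows. The sketch assumes a stream modeled as a list of frame records and assumes that B frames are never used as reference frames, so that they can be discarded without affecting the remaining frames. The names Frame, drop_b_frames, and bit_rate, and the frame sizes, are hypothetical and are not taken from any particular codec or system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    index: int       # display position within the stream
    kind: str        # "I", "P", or "B" (assumed frame types)
    size_bytes: int  # compressed size of this frame


def drop_b_frames(stream: List[Frame]) -> List[Frame]:
    """Frame rate reduction: keep only I and P frames.

    Assumes B frames are not reference frames, so the kept frames
    remain decodable after the B frames are discarded.
    """
    return [f for f in stream if f.kind != "B"]


def bit_rate(stream: List[Frame], duration_s: float) -> float:
    """Average bit rate, in bits per second, over the given duration."""
    return 8 * sum(f.size_bytes for f in stream) / duration_s


# Hypothetical group of pictures with illustrative frame sizes.
SIZES = {"I": 12000, "P": 6000, "B": 2000}
gop = [Frame(i, k, SIZES[k]) for i, k in enumerate("IBBPBBPBB")]
reduced = drop_b_frames(gop)
```

In this sketch the reduced stream carries a third of the frames over the same playback duration, and therefore a lower average bit rate. A practical transcoder would also have to rewrite timing and header information, which is omitted here.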
By way of example, a streaming media video clip may be part of a presentation of a web page. A large and powerful desktop receiver on a high-bandwidth connection may receive and decrypt a full-resolution, full-frame-rate video stream, for instance a high-definition television (HDTV) stream. However, a wireless adjunct to the same network may only be able to connect wireless users at a much lower bandwidth. To be carried over that connection, the stream must be converted to a lower-bandwidth signal. This conversion is called transcoding.
While network transcoding facilitates scalability in video streaming systems, it also poses a serious threat to the security of the streaming system. This is because conventional transcoding operations performed on encrypted streams generally require decrypting the stream, transcoding the decrypted stream, and then re-encrypting the result. Specifically, the transcoder requires the encryption key, and the content exists in decrypted, plain form at the transcoder. Because every transcoder must decrypt the stream, each network transcoding node presents a possible breach in the security of the entire system.
Furthermore, there may be strategically placed nodes in a network that are ideally located for performing transcoding but cannot be trusted. These untrusted nodes may be individual computers, client intranets at remote locations, or any other node that is interposed between an original sender and an intended receiver.
More specifically, in conventional video streaming approaches employing, for example, application-level encryption, video is first encoded, or compressed, into a bitstream using inter-frame compression algorithms. The resulting bitstream can then be encrypted, and the resulting encrypted stream is packetized and transmitted over the network using a transport protocol such as the User Datagram Protocol (UDP), which does not guarantee delivery of packets.
It is noted here that, in this discussion of background, the terms “encode, decode, encoding, decoding, encoded, decoded,” etc. refer to the compression or other encoding of data into forms suitable for transport over network carriers, whether those carriers are cable, optical fiber, wireless carrier, or other network connection. The terms “encrypt, decrypt, encrypting, decrypting, encryption, decryption,” etc. refer to cryptographic encoding that is used to protect data from unauthorized recipients or to verify that the data received is exactly what was originally sent.
Prior art FIG. 1A is a block diagram, 100, which illustrates the order in which conventional application-level encryption is performed (i.e., Compression Encoding, 102, Encryption/Checksum Computation, 104, and Packetization, 106). One difficulty with this conventional approach arises when a packet is lost. Specifically, without the data from the lost packet, decryption and/or decoding of the remaining data may be difficult, if not impossible, so error recovery is hard to achieve.
Prior art FIG. 1B illustrates the resultant packetized media stream as produced by process 100 of FIG. 1A. Media stream 111 is compressed by compression encoding function 102. The compressed stream is then encrypted, and a cryptographic checksum (CCS) 112 is appended, by Encryption/CCS function 104. Packetization, 106, separates the signal, consisting of the media stream data and the CCS, into packets 113 of the size required by the network. All of the packets must be reassembled into the encrypted media stream in order to decrypt the data, or to verify it if it is unencrypted. If one of the packets 113 is lost, then the entire message is lost, because the CCS cannot be validated without the missing packet.
It is noted here that encryption and CCS computation are related but not the same operation. A CCS can be computed and appended to an unencrypted media stream and the CCS can be used to verify the integrity and authenticity of the stream at the receiver.
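The packet-loss difficulty of FIG. 1B, and the use of a CCS on an unencrypted stream, can be sketched as follows. This is a minimal illustration only: it assumes HMAC-SHA-256 as the cryptographic checksum and an arbitrary 16-byte packet size, neither of which is specified in this discussion, and the function names append_ccs, packetize, and verify are hypothetical.

```python
import hashlib
import hmac
from typing import List

TAG_LEN = 32      # length of an HMAC-SHA-256 tag in bytes
PACKET_SIZE = 16  # assumed packet payload size, for illustration only


def append_ccs(stream: bytes, key: bytes) -> bytes:
    """Compute one CCS over the whole stream and append it (function 104)."""
    return stream + hmac.new(key, stream, hashlib.sha256).digest()


def packetize(message: bytes, size: int = PACKET_SIZE) -> List[bytes]:
    """Split the stream-plus-CCS into fixed-size packets (function 106)."""
    return [message[i:i + size] for i in range(0, len(message), size)]


def verify(received: List[bytes], key: bytes) -> bool:
    """Reassemble the received packets and check the single CCS."""
    message = b"".join(received)
    if len(message) < TAG_LEN:
        return False
    stream, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, stream, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


key = b"shared-secret-key"                        # hypothetical key
media = b"compressed media stream payload " * 4   # stand-in for stream 111
packets = packetize(append_ccs(media, key))

all_arrived = verify(packets, key)
one_lost = verify(packets[:1] + packets[2:], key)  # second packet dropped
```

Because a single checksum covers the entire stream, the receiver in this sketch can validate the data only when every packet arrives. With one packet missing, verification fails for the whole message, even though most of the data was delivered intact.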
Prior art FIG. 1C illustrates a functional block diagram of a transcoding process in which encrypted data must be transcoded for reasons discussed previously. In process 120, the media stream is decrypted at 122, transcoded at 124, then re-encrypted at 126. During the period in which the data is unencrypted, it is accessible to unauthorized reading or manipulation at an insecure or untrusted node.
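The exposure described for process 120 can be sketched as follows. The cipher below is a toy keystream construction built from SHA-256 purely so that the example is self-contained; it is an assumption made for illustration, is not a cipher named in this discussion, and is not suitable for real use. The transcode function is likewise a hypothetical stand-in for an actual transcoding operation.

```python
import hashlib
from typing import Tuple


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream.

    Applying it twice with the same key returns the original data.
    Illustration only; not a secure construction.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        stream.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def transcode(plain: bytes) -> bytes:
    """Hypothetical stand-in for transcoding: keep every other byte."""
    return plain[::2]


def conventional_transcoder_node(encrypted: bytes, key: bytes) -> Tuple[bytes, bytes]:
    """Decrypt (122), transcode (124), re-encrypt (126).

    Returns the re-encrypted output together with the plaintext that
    was visible inside the node, to make the exposure explicit.
    """
    plain = keystream_xor(encrypted, key)   # node must hold the key
    reduced = transcode(plain)              # operates on plaintext
    return keystream_xor(reduced, key), plain


key = b"end-to-end-key"             # hypothetical key
original = b"confidential media stream"
sent = keystream_xor(original, key)

forwarded, exposed = conventional_transcoder_node(sent, key)
```

In this sketch the node cannot perform its function without both the key and the complete plaintext, so an untrusted node running this process would see everything the sender intended to protect.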
In hybrid wired/wireless networks, it is often necessary to simultaneously stream media to fixed clients on a wired network and to mobile clients on a wireless network. In such a hybrid system, it may often be desirable to send a full-bandwidth, high-resolution media stream to the fixed, wired client, and a lower-bandwidth, medium-resolution media stream to the mobile, wireless client. Conventional media streaming approaches, however, do not achieve the efficiency, security, and scalability necessary to readily accommodate streaming over hybrid wired/wireless networks.
Although the above discussion specifically mentions the shortcomings of prior art approaches with respect to the streaming of video data, such shortcomings are not limited solely to the streaming of video data. Instead, the problems of the prior art span various types of media including, but not limited to, audio-based data, speech-based data, image-based data, graphic data, web page-based data, and the like.
Accordingly, what is needed is a method and/or system that can enable a potentially untrusted transcoder in the middle of a network to transcode a media stream while still preserving the end-to-end security of the rest of the stream. Specifically, what is needed is a means for computing and applying the cryptographic checksum that allows a potentially untrusted transcoder to perform the transcoding in an appropriate manner, while still allowing the intended receiver to validate the integrity of the transmitted data and allowing any encryption of the transmitted data to remain uncompromised.