1. Field of the Invention
The present invention generally relates to transmission of data over a network and, more specifically, to an apparatus, system and method for selectively encrypting and decrypting different portions of data sent over a network.
2. Description of Related Art
Networking, today exemplified by the Internet, began as a way for users to share text-based information and files. The technology has, however, advanced far beyond text. The Internet can be used to conduct videoconferences where the participants in the conference can see each other in real-time on their computer screens or network-connected conferencing devices. Internet users can watch live video footage of events as they happen, or view pre-recorded programming “on demand,” i.e., whenever they choose rather than on a broadcast schedule.
To understand how video and multimedia content is transmitted over the Internet, first it is necessary to understand the concept of streaming. Streaming solves a long-standing problem inherent in transmitting media information across the Internet: multimedia files are quite large, and most users have relatively low-bandwidth network connections. Sending broadcast-quality video or CD-quality audio across the Internet has never been easy or practical. It could take hours to send a single audio or video file across the network to someone's computer. The person at the other end would have to wait until the entire file was downloaded before playback could begin. Playback might last only a few minutes.
It has long been possible to reduce the amount of data required to transmit multimedia material. One way is to reduce the resolution, frame rate or sampling rate of the transmitted data, which has the side effect of reducing the perceived quality of the audio or video content. Another commonly used method is to apply data compression to the data to remove redundancies and retain only the most essential data from a multimedia file. By combining these two techniques, it is possible to produce multimedia files of reasonable quality that can be transmitted over typical Internet connections in time frames similar to the running time of the media content itself, i.e. minutes instead of hours. The actual quality of downloaded multimedia material varies depending on the available network bandwidth and the willingness of the user to wait for it to arrive; it is possible to trade off quality against file size and to prepare higher-quality versions of the content for users with more bandwidth.
When the amount of data per second of multimedia content is equal to or less than the amount of bandwidth available to a network user, streaming can occur. Streaming is essentially “just-in-time” delivery of multimedia data. The user does not need to wait for the entire file to arrive before s/he can begin viewing it, nor does s/he need to dedicate a large piece of his/her computer's storage to the received data. As it arrives at the client computer, the multimedia data is typically stored in a buffer capable of holding several seconds' worth of data, in order to avoid interruptions in playback caused by erratic network connections. When the buffer is initially filled, which happens in seconds rather than minutes or hours, playback begins, and individual frames of video and/or audio segments are retrieved from the buffer, decoded, and displayed on the client's video monitor, played through the client's sound system, or otherwise played back via the client's network connected device. After the data has been played, it is in many cases discarded and must be transmitted again if the user wishes to review a previous portion of the program.
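The streaming condition and the initial buffering delay described above can be illustrated with simple arithmetic. The following is a minimal sketch under stated assumptions: the function names and the example figures (a 300 kbit/s stream, a 500 kbit/s connection, a 5-second buffer) are hypothetical and are not taken from any particular player or server.

```python
def can_stream(media_kbps: float, link_kbps: float) -> bool:
    """Streaming is feasible when the media data rate does not exceed the link bandwidth."""
    return media_kbps <= link_kbps


def buffer_fill_seconds(media_kbps: float, link_kbps: float, buffer_seconds: float) -> float:
    """Time needed to receive `buffer_seconds` worth of media before playback can start."""
    if link_kbps <= 0:
        raise ValueError("link bandwidth must be positive")
    buffered_kbits = media_kbps * buffer_seconds  # amount of data the buffer must hold
    return buffered_kbits / link_kbps


if __name__ == "__main__":
    # Hypothetical figures: 300 kbit/s media over a 500 kbit/s connection.
    print(can_stream(300, 500))              # True
    print(buffer_fill_seconds(300, 500, 5))  # 3.0
```

With these assumed figures the buffer fills in about three seconds, which is consistent with playback beginning in seconds rather than minutes or hours.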
The World Wide Web (“Web”) operates on a client/server model; that is, a user (i.e. a client) runs a piece of software on his/her personal computer or other network device (such as a network appliance or Internet capable wireless phone) capable of accessing the resources of a network server. The server can allow many different users to access its resources at the same time and need not be dedicated to providing resources to a single user. In this model, the client software—e.g. a browser or player—runs on the user's computer. When a user requests a video or other multimedia content on the Web, his/her client software contacts the Web server containing the desired information or resources and sends a request message. The Web server locates and sends the requested information to the browser or player, which displays the results by interpreting the received data as appropriate, e.g. displaying it as video on the computer's display.
Information on the Web is addressed by means of Uniform Resource Locators (URLs). A URL specifies the protocol to be used to access the required information (commonly HTTP, the HyperText Transfer Protocol), the address or Internet name of the host containing the information, any authentication information required to gain access to the information (such as a User ID and Password), an optional TCP/IP port number if the resource is not available on the standard port for the specified protocol, and a path to the desired information in a virtual hierarchy designated by the server operators. Additionally, the URL can contain information entered by the user such as query text for a database search. URLs can be specified in many ways, including but not limited to typing them into a Web client, by clicking a “hyperlink” in some other Web document, by choosing a “bookmark” in a Web browser to return to a previously-visited page, or by filling out a Web form and submitting it to the server.
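The components of a URL listed above can be seen by taking one apart. The sketch below uses Python's standard `urllib.parse` module; the URL itself, including the host name, credentials, port and query text, is a made-up example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL containing every component described in the text.
url = "http://alice:secret@media.example.com:8080/videos/trailer.html?q=streaming"
parts = urlsplit(url)

components = {
    "protocol": parts.scheme,        # access protocol, here HTTP
    "host": parts.hostname,          # Internet name of the host
    "user": parts.username,          # authentication information
    "password": parts.password,
    "port": parts.port,              # optional non-standard port
    "path": parts.path,              # path in the server's virtual hierarchy
    "query": parse_qs(parts.query),  # information entered by the user
}

if __name__ == "__main__":
    for name, value in components.items():
        print(f"{name}: {value}")
```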
When a request is made for a particular URL, the request is sent to the appropriate server using the protocol indicated in the URL, such as HTTP, and the requested data (or an error message if the request cannot be fulfilled) is returned by the same method. However, HTTP and other Web protocols are built on top of fundamental Internet protocols, such as TCP (Transmission Control Protocol) or UDP (User Datagram Protocol). The protocols used for streaming media are built on top of these fundamental protocols in the same way that other high-level protocols, such as HTTP, are. It is essential to realize that all Internet traffic is carried by a relatively small number of low-level, fundamental protocols such as TCP and UDP, which are used to establish the connection between computers that carries the higher-level protocol. When streaming multimedia content, then, the server may use either TCP or UDP.
RTP (Real-time Transport Protocol), RTSP (Real-Time Streaming Protocol), and RTCP (RTP Control Protocol) are three of the many possible streaming media protocols that can be built on top of the Internet's low-level protocols. In fact, RTP, RTSP, and RTCP serve complementary functions and are most often used together as part of some of the more common streaming implementations. RTSP is used to set up and manage a streaming connection between a server and a client, RTP is used to deliver the actual multimedia data, and RTCP provides timing and other control signals between the client and the streaming server. Often, RTP is sent on top of UDP, a low-overhead protocol that delivers the data as quickly as possible but does not guarantee that any particular piece of data (a “packet” in network terminology) will actually arrive at its destination. RTCP packets are sent interleaved with the data or in parallel with the RTP media channel via TCP.
One serious drawback of using the Internet for the distribution of streaming media is that the data of necessity often passes through networks and systems that are not controlled by the sender of the data en route to the client. Once the data leaves the sender's protected network, it is vulnerable to interception. This is of particular concern when proprietary data, such as a motion picture, is transmitted across the network, be it the public Internet or a third party private network. The data can potentially be intercepted and copied at any point as it is transmitted across these networks. Without some way to protect data from interception and unauthorized duplication, the Internet will never provide the security necessary to allow copyright owners to safely distribute their works over the network.
Encryption is one way to address this issue. Encryption provides a way to encode information such that only the intended recipient can view it. While anyone can intercept the data, only the legitimate recipient will be able to decrypt it, retrieve the original message or media content, and display it. Many encryption solutions have been created to provide this type of security. For example, software referred to as a VPN (Virtual Private Network) allows a group of computers connected to the Internet to behave as if they were connected to a physically secure local network, using encryption to ensure that only computers on the VPN can access the private network resources. Other applications of encryption to allow private communications over the Internet include but are not limited to PGP (Pretty Good Privacy), IPsec (IP Security) and SSL (Secure Sockets Layer).
Organizations, often but not limited to corporations, also protect their proprietary data by use of firewalls. Every time a corporation connects its internal computer network or local area network (LAN) to the Internet, it faces a similar problem of private data being intercepted. Whereas encryption is used to ensure that data sent through the public Internet is usable only by intended recipients, firewalls are aimed at keeping proprietary information secure on a LAN which is connected to the Internet by preventing unauthorized users from accessing information stored on the internal network via the public Internet. Due to the Internet's public nature, every LAN connected to it is vulnerable to attack from the outside. Firewalls allow anyone on the protected LAN to access the Internet in ways allowed by the firewall configuration, while stopping hackers on the Internet from gaining access to the LAN and stealing information and/or deleting or otherwise vandalizing valuable data.
Firewalls are special-purpose devices built on routers, servers, and specialized software. One of the simplest kinds of firewalls uses packet filtering. In packet filtering, a router screens each packet of data traveling between the Internet and the LAN by examining its header. Every TCP/IP packet has a header containing the IP address of the sender and receiver as well as the port number of the connection and other information. By examining the header, and particularly the port number, the router can determine with a fair amount of accuracy the type of Internet service each packet is being used for. Each Internet service, including HTTP (the Web), FTP (File Transfer Protocol), Telnet, rlogin, and many others, has a standard port number that is used by convention for most access to the service. While any port number can be used for any service, it is far more convenient for users to use the standard port numbers, and virtually all Internet services intended for access by the public do so. Once the router knows what client and what service a packet is intended for, it can simply block outside access to hosts and services that public Internet users should not be able to use. System administrators set the rules for determining which packets should be allowed into the network and which should be blocked.
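The header-based screening described above can be sketched as a first-match rule table. This is an illustrative simplification, not the rule format of any particular router: the `Packet` and `Rule` types, the `is_allowed` function, and the example addresses (taken from documentation address ranges) are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    dst_network: str         # destination addresses the rule covers, in CIDR form
    dst_port: Optional[int]  # None matches any port
    allow: bool


def is_allowed(packet: Packet, rules: List[Rule], default_allow: bool = False) -> bool:
    """Return the decision of the first rule matching the packet's header fields."""
    for rule in rules:
        in_network = ip_address(packet.dst_ip) in ip_network(rule.dst_network)
        port_matches = rule.dst_port is None or rule.dst_port == packet.dst_port
        if in_network and port_matches:
            return rule.allow
    return default_allow  # no rule matched


# Hypothetical policy: expose only the Web server's HTTP port to the outside.
RULES = [
    Rule("192.0.2.10/32", 80, allow=True),    # public Web server, standard HTTP port
    Rule("192.0.2.0/24", None, allow=False),  # everything else on the LAN is blocked
]

if __name__ == "__main__":
    print(is_allowed(Packet("203.0.113.5", "192.0.2.10", 80), RULES))  # True
    print(is_allowed(Packet("203.0.113.5", "192.0.2.10", 23), RULES))  # False
```

Note that the decision depends only on header fields; a filter of this kind cannot make it if those fields are unreadable.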
Proxy servers are commonly used in conjunction with firewalls. A proxy server is a software gateway that runs on a computer that is accessible by both the protected LAN and the Internet. All access to the Internet from the LAN must go through the proxy server, as must all access to the LAN from the public Internet. When a computer on the LAN requests an Internet resource such as a Web page, that request is sent to the proxy server. The proxy server then makes the request to the Internet resource and forwards any returned data back to the original requester on the LAN. Since the Internet and the LAN touch only at the single point of the proxy server, which acts as a go-between, protecting the network involves securing only one computer, rather than dozens or hundreds. Proxy servers are frequently used by employers to control and monitor how their employees use the Internet, as well as for preventing intrusion attempts from the Internet.
Network Address Translation (NAT) complicates delivery of Internet data as well. In a NAT configuration, a privately addressed network is separated from the Internet by a NAT router, which in turn has an address on the public Internet. By translating the addresses of the private network into addresses recognizable on the public Internet and vice versa, the NAT router facilitates Internet connectivity for the privately networked machines. NATs are often used where limited numbers of public addresses are available in comparison to the number of users (a user might use a NAT for multiple machines sharing a cable modem connection) or in conjunction with firewalls and/or proxies to provide an extra layer of security to the private network.
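The address translation described above can be sketched as a pair of lookup tables. The example assumes the common port-based variant, in which many private hosts share one public address and are told apart by port number; the `NatRouter` class, its method names, and the addresses are hypothetical.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class NatRouter:
    """Toy port-based address translator for a single public IP address."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound: Dict[Endpoint, int] = {}  # private endpoint -> public port
        self._inbound: Dict[int, Endpoint] = {}   # public port -> private endpoint

    def translate_outbound(self, private_ip: str, private_port: int) -> Endpoint:
        """Map a private source endpoint to the public endpoint seen on the Internet."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return (self.public_ip, self._outbound[key])

    def translate_inbound(self, public_port: int) -> Optional[Endpoint]:
        """Map a reply back to the private endpoint, or None if no mapping exists."""
        return self._inbound.get(public_port)


if __name__ == "__main__":
    nat = NatRouter("203.0.113.7")
    print(nat.translate_outbound("10.0.0.5", 5004))  # ('203.0.113.7', 40000)
    print(nat.translate_inbound(40000))              # ('10.0.0.5', 5004)
    print(nat.translate_inbound(40001))              # None
```

The router must read the address and port fields of each packet to perform this mapping, so traffic for which no mapping exists, or whose fields cannot be read, is not delivered to the private network.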
Data packets can typically be divided into two parts, the header and the payload. The header is the portion of the packet that includes routing or other configuration information. The payload is the portion of the data packet that carries the data of interest, in the exemplary case multimedia content. For example, in a network packet, the header contains data for use by network routers in delivering the packet to its final destination, as well as other data about the packet such as size and formatting information. In an exemplary RTP packet, the header contains channel information as well as other information needed by the player to direct the RTP (media content) payload. Some complex packets may contain multiple headers and diverse non-payload information; this portion will be referred to herein as the non-payload part.
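The header/payload division can be made concrete with an RTP packet. The sketch below separates the two parts using the fixed 12-byte header layout defined in RFC 3550, skipping any contributing-source entries and header extension; the function name `split_rtp_packet` and the sample packet are hypothetical, and trailing padding is deliberately not handled.

```python
import struct
from typing import Dict, Tuple


def split_rtp_packet(packet: bytes) -> Tuple[Dict[str, int], bytes]:
    """Split an RTP packet into (header fields, payload bytes)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = b0 & 0x0F
    header_len = 12 + 4 * csrc_count          # fixed header plus CSRC list
    if b0 & 0x10:                             # extension bit set
        if len(packet) < header_len + 4:
            raise ValueError("truncated header extension")
        (ext_words,) = struct.unpack("!H", packet[header_len + 2:header_len + 4])
        header_len += 4 + 4 * ext_words       # extension header plus its body
    if len(packet) < header_len:
        raise ValueError("truncated RTP header")
    header = {
        "version": b0 >> 6,
        "marker": b1 >> 7,
        "payload_type": b1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
    return header, packet[header_len:]


if __name__ == "__main__":
    # Hypothetical packet: version 2, payload type 96, followed by 5 payload bytes.
    sample = struct.pack("!BBHII", 0x80, 96, 1, 1000, 0xDEADBEEF) + b"media"
    header, payload = split_rtp_packet(sample)
    print(header["version"], header["payload_type"], payload)  # 2 96 b'media'
```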
Many current encryption solutions, such as IPsec, encrypt data far too indiscriminately, encrypting not only the payload but portions of the header or non-payload part as well. Only the routing information remains unencrypted, so the information that firewalls, proxies and/or NATs need in order to relay the packets appropriately is scrambled, and the packet is stopped before it can pass into a private or protected network. Other solutions provide inadequate encryption to assure integrity of the data across the public network. Some streaming media encryption solutions deal with the problem by placing the encryption system on the server machine in the form of special software that encrypts the streaming media before it is converted to packets for network transmission. While this results in packets that can often successfully pass through proxy servers and firewalls, the strategy is limiting: it requires modification of the streaming server and often requires the addition of extra streaming servers or processing capacity to handle the increased workload that encryption adds. Also, the solution must be reengineered for each streaming server platform supported.