An important objective in delivering content in a network is to deliver highly desired segments of content on a timely basis, preferably immediately and without latency. For the purposes of describing embodiments of the present invention, a network is any type of communication network including, without limitation, the Internet, intranets, a mobile wireless network, a wired network, a wireless network, a metropolitan area network, a local area network, a 3G mobile wireless network, and any combination of such networks. Latency is the time delay between the end-user inputting a request for content from a network and receiving a reply from the network. A segment is a portion of the original content, smaller in size and/or duration than the original content.
For the purposes of describing embodiments of the present invention, content includes multimedia data (also referred to herein as media data or media content) as exemplified by video data accompanied by audio data. In common terms, multimedia data may be a movie with a soundtrack, audio-based data, image-based data, Web page-based data, graphic data and the like, and any combination thereof. Content may include data that is or is not encoded (compressed), encrypted or transcoded. For purposes of clarity and brevity, the following discussions and examples sometimes deal specifically with video data; however, the embodiments of the invention are not limited to use with video data but embrace all content.
In delivering content on networks, four end-user behavior patterns have been observed. Firstly, it has been observed that some content is requested more frequently than other content. For the purposes of describing embodiments of the present invention, such content is referred to as “high user-activity content”.
Secondly, it has been observed that even with high user-activity content, when content is received by the end-user, only the beginning portion of the content is viewed. For the purposes of describing embodiments of the present invention, the segments that make up this beginning portion are referred to as “introductory” (or “leading” or “hot”) segments.
Thirdly, after viewing the introductory segments of high user-activity content, the end-user may decide to view additional segments, or abandon the content in its entirety at that stage.
Fourthly, the abandonment rate after viewing the introductory segments is high.
In managing content on a network, caching of content in the network is a known technique for expediting delivery of content to end-users. Caching involves placing “edge servers”, also referred to as caching proxies or caches, in the network closer to the end-user. By placing content in edge servers, the data-path length between the server and the end-user is shortened, thus shortening the wait-time between requesting content and receiving a reply, i.e., latency is reduced. Further, by placing the content in edge servers, the load on the origin server is reduced because the origin server is not receiving every request. Hence, the origin server can service other requests more responsively, as the requests for cached content can be handled by the edge server.
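The load-reduction effect described above can be illustrated with a minimal sketch. The class names `OriginServer` and `EdgeServer`, the content identifier `"movie-1"`, and the in-memory dictionary used as the cache are all hypothetical and chosen only for illustration; they do not describe any particular caching product.

```python
class OriginServer:
    """Hypothetical origin server that counts the requests it serves."""

    def __init__(self, catalog):
        self.catalog = dict(catalog)
        self.requests_served = 0

    def fetch(self, content_id):
        self.requests_served += 1
        return self.catalog[content_id]


class EdgeServer:
    """Hypothetical edge server (caching proxy) placed closer to end-users."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, content_id):
        # Cache hit: reply from the edge without contacting the origin.
        if content_id in self.cache:
            return self.cache[content_id]
        # Cache miss: fetch from the origin once, then keep a copy.
        data = self.origin.fetch(content_id)
        self.cache[content_id] = data
        return data


origin = OriginServer({"movie-1": b"...video bytes..."})
edge = EdgeServer(origin)

# Five end-user requests for the same content reach the edge server.
for _ in range(5):
    edge.get("movie-1")

print(origin.requests_served)  # 1
```

In this sketch only the first request travels the full path to the origin server; the remaining four are answered by the edge server, which is the source of both the shorter data path and the reduced origin load.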
In the prior art, it has been the practice for caches in the network to cache the entire content in the original format as received from the content server, or to cache the content in a transcoded format, also in its entirety. Transcoding is often necessary to cope with the variation in size of computers and other devices on the network. These devices can, for example, display Web pages and video streams, play back stored audio, and/or make any network connection that a device on the network can make. Usually, a transcoding proxy is used to transform content, originally formulated for a full-size display, into content more appropriate to the form factors of a wide range of devices. For example, the transcoding proxy can take the original content from the content server and change it to accommodate screen size, data rates of transmission, or any other transformation that is appropriate.
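As a rough illustration of the kind of decision a transcoding proxy makes, the sketch below derives a target format from a source format, a device screen width, and a link data rate. The `Format` type, the `target_format` function, and all numeric values are assumptions introduced for this example; the actual re-encoding of the media is outside its scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Format:
    """Hypothetical description of a video format."""
    width: int
    height: int
    bitrate_kbps: int


def target_format(source: Format, screen_width: int, link_kbps: int) -> Format:
    """Pick output parameters that fit the device screen and link rate."""
    # Never upscale beyond the source width.
    width = min(source.width, screen_width)
    # Preserve the source aspect ratio and round down to an even height.
    height = source.height * width // source.width
    height -= height % 2
    # Never exceed the data rate the link can carry.
    bitrate = min(source.bitrate_kbps, link_kbps)
    return Format(width, height, bitrate)


source = Format(width=1920, height=1080, bitrate_kbps=8000)

# A small handheld device on a slow link (assumed values).
print(target_format(source, screen_width=320, link_kbps=300))
# Format(width=320, height=180, bitrate_kbps=300)

# A desktop on a fast link receives the source parameters unchanged.
print(target_format(source, screen_width=2560, link_kbps=20000))
# Format(width=1920, height=1080, bitrate_kbps=8000)
```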
Caching content in its entirety in edge servers works well when the size of the content is relatively small compared to the storage capacity of the edge servers. For example, a Web page, which is relatively small (much less than a megabyte in size), can be easily cached without consuming a substantial portion of the storage capacity.
However, caching content in its entirety in the edge servers is problematic when the content is large compared to the storage capacity, and/or the content has a long playback time, e.g., multimedia data. Thus, transcoding and caching multimedia data such as DVD videos in their entirety in an edge server is not practical, as the content will quickly consume the limited storage capacity.
The problem of caching large content in its entirety in an edge server is exacerbated when considering that, typically, a multiplicity of different content objects need to be cached, each possibly having multiple versions. Different versions may exist because of the need to accommodate a variety of network connections utilized by end-users. In addition, different versions may exist to accommodate the different capabilities of different types of client devices (e.g., desktops, laptops, personal digital assistants, cell phones, etc.). Also, different classes of devices typically have different processing and display capabilities. For example, while a personal digital assistant can receive and display a streamed video, it does not have the processing and display capabilities of a desktop. Accordingly, a reduced bit rate and reduced resolution version of the video is produced for use on the personal digital assistant, while a higher bit rate and higher resolution version is produced for the desktop.
Further, caching large content in its entirety in an edge server, whether in an original or transcoded format, is not an efficient use of cache storage, especially if the content is not high user-activity content or if the rate of abandoning the content is high, which, as previously noted, is typically the case.
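The storage pressure created by multiple versions can be made concrete with a short back-of-the-envelope calculation. All figures below (a two-hour video, three bit rates, and a 100 GB cache) are assumed values chosen only for illustration and are not taken from any measured system.

```python
def version_size_bytes(bitrate_kbps: int, duration_s: int) -> int:
    """Approximate size of one encoded version: bit rate x duration / 8."""
    return bitrate_kbps * 1000 * duration_s // 8


DURATION_S = 2 * 60 * 60  # assumed two-hour video

# Assumed bit rates for three device classes, in kilobits per second.
VERSIONS_KBPS = {"cell phone": 300, "PDA or laptop": 1500, "desktop": 5000}

sizes = {name: version_size_bytes(kbps, DURATION_S)
         for name, kbps in VERSIONS_KBPS.items()}
per_title = sum(sizes.values())

CACHE_CAPACITY_BYTES = 100 * 10**9  # assumed 100 GB edge cache
titles_that_fit = CACHE_CAPACITY_BYTES // per_title

print(per_title)        # 6120000000
print(titles_that_fit)  # 16
```

Under these assumptions, a single title cached in its entirety in all three versions occupies about 6.12 GB, so the cache holds only 16 such titles, whereas it could hold a far larger number of sub-megabyte Web pages.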
Accordingly, there is a need for a more efficient way of caching and expediting delivery of content in a network.