The present invention relates to delivery of multimedia content to client devices and, more particularly, to methods and apparatus for adapting such multimedia content for diverse client devices.
Web documents delivered on the Internet are multimedia presentations that may include video, images, graphics, text and audio. Due to the recent rapid growth in the number of devices connected to the Internet, there is a growing demand for universal access to such multimedia content from a wide variety of devices over a wide range of network environments. For example, personal computers on a local area network (LAN), personal digital assistants (PDAs) on dial-up modems and smart cellular phones have drastically different client resources in terms of, for example, screen size, resolution, color depth, network bandwidth and computing power. Internet users also vary in their ability to pay for Internet services and in the time they are willing to wait for a page to download. Therefore, to provide universal access to the Internet, multimedia delivery methods need to account for the composite nature of Web documents and for the variety of client platform capabilities, user interests, network constraints and authoring policies.
In this context, video-conferencing systems have been proposed that adapt to the bandwidth available to the client by selecting a suitable compression factor or codec. These systems consider only a single type of multimedia item, namely video. They also do not consider clients that cannot handle video.
One option for content adaptation is to manually develop multiple versions of multimedia content, each suitable for a class of client devices. Given the variety of client devices, it is difficult for content publishers to anticipate and accommodate the wide spectrum of client capabilities. For composite multimedia documents, such as Web pages, a number of systems have been proposed that employ a proxy between the Web server and the client. For example, various proxy approaches are described in: J. R. Smith, R. Mohan, and C-S. Li, "Transcoding Internet Content for Heterogeneous Client Devices," In Proc. IEEE Inter. Symp. on Circuits and Syst. (ISCAS), Special Session on Next Generation Internet, June 1998; A. Fox, S. D. Gribble, E. A. Brewer, and E. Amir, "Adapting to Network and Client Variability Via On-demand Dynamic Distillation," In ASPLOS-VII, Cambridge, Mass., October 1996; A. Ortega, F. Carignano, S. Ayer, and M. Vetterli, "Soft Caching: Web Cache Management Techniques for Images," In IEEE Workshop on Multimedia Signal Processing, pp. 475-480, Princeton, N.J., June 1997; Intel Quick Web accessible on the Internet at http://www.intel.com/quickweb; Spyglass Prism accessible on the Internet at http://www.spyglass.com/products/prism; A. Fox and E. A. Brewer, "Reducing WWW Latency and Bandwidth Requirements by Real-time Distillation," In Proc. of the 5th International WWW Conference, 1996; and T. W. Bickmore and B. N. Schilit, "Digestor: Device-independent Access to the World Wide Web," In Proc. of the 6th International WWW Conference, 1997. The proxy distills, or transcodes, the content from the Web server. This transcoding is primarily limited to the compression of images, or a reduction of their size or color space. These systems do not consider transcoding into different modalities. The image compression and size reduction policies are static and do not dynamically account for resources on the client.
The present invention adapts multimedia content, e.g., Web documents, to optimally match the capabilities of the client device requesting it. Each Web document is a set of items, each of which is authored in a particular modality such as text or image. Each of these content items is then transcoded into multiple resolution and modality versions so that it can be rendered on different devices. For example, a video item is transcoded into a selected set of images so that it can be rendered on a device not capable of displaying video. Each version of a content item requires different resources from the client device. The invention ensures that the resource requirements for the entire document, as given by the sum of the resource requirements of its constituent items, can be met by the requesting client. To do so, the invention allocates the resources on the client among the items in the document. This resource allocation results in the selection of an appropriate resolution or modality for each content item. If the client has limited resources, as is the case for a PDA or pager, some of the content items may not be assigned any resources and thus may not be delivered to the client.
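By way of illustration only, the resource allocation described above can be sketched as a small selection problem. The sketch below is not the claimed adaptation process; it assumes a single scalar client resource (here, kilobytes), hypothetical per-version value scores, and an exhaustive search that is practical only for documents with a few items. The names Version, select_versions and document are introduced solely for this example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    modality: str   # e.g. "video", "image", "text"
    cost: int       # client resource required (assumed unit: kilobytes)
    value: float    # hypothetical score for how much the version conveys


def select_versions(
    document: Dict[str, List[Version]], budget: int
) -> Dict[str, Optional[Version]]:
    """Pick at most one version per item so that the summed cost fits the
    client budget and the summed value is as large as possible."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    names = list(document)
    # None means the item receives no resources and is not delivered.
    choices = [[None, *document[name]] for name in names]
    best = tuple(None for _ in names)
    best_value = 0.0
    for combo in product(*choices):
        cost = sum(v.cost for v in combo if v is not None)
        if cost > budget:
            continue
        value = sum(v.value for v in combo if v is not None)
        if value > best_value:
            best, best_value = combo, value
    return dict(zip(names, best))


# Illustrative document: a video clip and a photograph, each with fallbacks.
document = {
    "clip": [Version("video", 500, 10.0), Version("image", 60, 4.0),
             Version("text", 2, 1.0)],
    "photo": [Version("image", 80, 5.0), Version("text", 1, 1.0)],
}

desktop = select_versions(document, budget=1000)
handheld = select_versions(document, budget=100)
pager = select_versions(document, budget=1)
```

Under these assumed costs and values, a well-resourced client receives the video and the image, a constrained client receives a text substitute for the video together with the image, and a severely constrained client receives only a text substitute for the photograph, the clip being dropped entirely.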
In an embodiment of the invention, as will be explained, three technologies are employed to provide such multimedia content adaptation: (i) a progressive data representation scheme referred to as the InfoPyramid as described in C-S. Li, R. Mohan and J. R. Smith, "Multimedia Content Description in the InfoPyramid," Proc. ICASP '98, Special Session on Signal Processing in Modern Multimedia Standards, Seattle, Wash., May 1998, the disclosure of which is incorporated herein by reference; (ii) a set of transcoding modules for converting modality or resolution; and (iii) an adaptation process that selects the best representation to meet the client capabilities while delivering the most value to the client.
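As a rough illustration of how items (i) and (ii) may fit together, the sketch below stores the versions of one content item in a table keyed by modality and resolution level, and fills that table by applying transcoding modules to the authored version. It is a simplified assumption about the data layout, not the InfoPyramid of the cited reference; the class methods and the two modules shorten and to_speech_markup are hypothetical stand-ins.

```python
from typing import Callable, Dict, Set, Tuple

# (modality, resolution level); level 0 is assumed to be the highest fidelity.
Key = Tuple[str, int]


class InfoPyramid:
    """Holds every available version of a single content item."""

    def __init__(self, modality: str, content: str) -> None:
        self.source: Key = (modality, 0)
        self.versions: Dict[Key, str] = {self.source: content}

    def transcode(self, target: Key, module: Callable[[str], str]) -> None:
        """Derive a version from the authored one with a transcoding module."""
        self.versions[target] = module(self.versions[self.source])

    def modalities(self) -> Set[str]:
        return {modality for modality, _ in self.versions}


def shorten(text: str) -> str:
    """Resolution reduction for text: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


def to_speech_markup(text: str) -> str:
    """Modality change placeholder: wrap the text for a hypothetical
    text-to-speech stage instead of synthesizing audio."""
    return f"<speak>{text}</speak>"


pyramid = InfoPyramid("text", "Stocks rose today. Analysts cited strong earnings.")
pyramid.transcode(("text", 1), shorten)
pyramid.transcode(("audio", 0), to_speech_markup)
```

Because every version is derived ahead of any request in this sketch, an adaptation process would only need to look up entries in the table when a client request arrives.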
The present invention provides many advantages over prior solutions. For example, content is dynamically adapted to the client device allowing a wider variety of multimedia content and of client devices to be properly supported. Also, in accordance with the invention, a content author has control over the adaptation process. The content author can edit and replace the transcoded versions of content items generated by the automated transcoding systems. This control of the customization overcomes problems of publisher control and copyright issues faced by transcoding proxies.
Further, the invention permits content to be authored in XML (Extensible Markup Language, as is known in the art), allowing the author to provide more information to the transcoding and adaptation systems than can be deduced from an HTML (HyperText Markup Language) page. One benefit of the server-based system of the invention is that, due to the guidance provided by the author, a significantly greater level of customization can be performed than is possible in previous transcoding proxies. Still further, the invention permits the transcoded versions of the content items to be generated prior to any requests. Thus, the invention can handle media items such as video and audio, which are difficult to handle in conventional proxies. This off-line transcoding also leads to lower response latencies than those of proxies. Also, the server shares the benefit of transcoding proxies in speeding content delivery, as the customized content is often much smaller than the original content.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.