1. Field of the Invention
The present invention relates to a computer system, and deals more particularly with a method, system, and computer-readable code for receiving and demultiplexing multi-modal document content.
2. Description of the Related Art
"Multi-modal document content" or "Multi-modal Web content" refers to a Web page which contains multiple media types, also referred to herein as multiple modes or multiple content types. A "Web page," as is well known in the art, is a file or document created for use in the World Wide Web environment (hereinafter, "Web"). Web pages are typically located using a "URL," or Uniform Resource Locator, which is a form of address adapted to use in the distributed network environment of the Web. Web pages are typically encoded in the HyperText Markup Language, or "HTML." As an example of a Web page being "multi-modal," it is common for a single Web page to include text as well as graphics, images, sound files, and perhaps video. While images may be embedded directly within a textual Web page document (e.g. using the "<img>", or image, tag in HTML), content having other media types is typically linked to the textual document in a manner that requires the user to select a link reference (such as a hypertext link from the displayed textual Web page) before the linked content will be rendered to the user. An anchor, or "<a>" tag, is used in HTML to provide this type of external link.
A user requesting a Web page uses a Web browser (which is a software application adapted to processing Web documents, such as an HTML browser which processes HTML documents) to generate a request for a Web page using its URL and to send the request to a Web server. The Web server then locates the requested content and returns it to the requesting browser. Upon receiving the requested document, the browser renders it for presentation to the user. The document text, when encoded in HTML format, is processed by an HTML parser and then displayed. (Text may be delivered in other formats, such as the Extensible Markup Language (XML), in which case a corresponding parser must process the encoded document before displaying it.)
The computing device on which the Web browser is executing typically has one or more "helper applications" installed, where these helper applications may comprise: an image processing application; an audio processing application; a video processing application; a text-to-speech generator (e.g. for use with documents encoded in the VoiceXML markup language); etc. The Web browser, upon detecting a content type which the browser is not prepared to render directly, automatically invokes the appropriate helper application to handle the received content. As an alternative to helper applications, applets or plug-ins may be used for processing multimedia files. Applets are small pieces of executable code that are typically downloaded to a user's computer from a server through a network dynamically, as the code is needed for execution. Applets are often referenced from a Web document and may be used to process some part of that document. Plug-ins are small, special-purpose software applications adapted to particular processing needs. A plug-in may be used, for example, to process a file (such as a sound file) which a particular Web browser is not capable of processing. These techniques are well known in the art, and the software with which they are implemented is readily available on the market.
A Web server communicating with a Web browser using the HyperText Transfer Protocol ("HTTP") typically returns a requested document to the browser as a two-part transmission. (Note that while the discussions herein refer to the HTTP protocol, this is for purposes of illustration and not of limitation. The Wireless Session Protocol, commonly referred to as "WSP," may be used alternatively, as may other semantically equivalent protocols.) The first part is a header describing the returned document, and the second part is the document itself. Within the HTTP header of the first part is a "Content-type" entry, describing the content type of the document using the Multipurpose Internet Mail Extensions ("MIME") notation. For example, if the document comprises text encoded in HTML, the content type will use the special syntax "text/html" (as defined by the MIME standard). When the response includes multiple documents or document parts having multiple content types, then the HTTP header preferably uses the content type "multipart/mixed" to indicate that a multipart message with data in more than one format is being sent. Or, if the multiple parts are to be viewed simultaneously, the content type "multipart/parallel" is preferably used. (Alternatively, the content types of "multipart/alternative" or "multipart/digest" may be used where appropriate. Refer to RFC (Request for Comments) 1521, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies" and RFC 1522, "MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text", for more information on MIME types.) This content type is interpreted by the receiving Web browser as it determines how to process and render the received Web document.
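By way of illustration only, the interpretation of a "Content-type" entry may be sketched as follows. The sketch assumes Python and its standard email package; the function names describe_content_type and is_multipart are hypothetical and are not part of any protocol or of the invention.

```python
from email.message import Message


def describe_content_type(header_value):
    """Split a Content-type header value into (main type, subtype).

    Illustrative helper only: the standard library parser is reused so
    that parameters following the type (after a ';') are ignored here.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_maintype(), msg.get_content_subtype()


def is_multipart(header_value):
    """Return True when the header announces a multipart document."""
    main_type, _subtype = describe_content_type(header_value)
    return main_type == "multipart"
```

A receiver might call is_multipart on the header of each response to decide whether the body must be separated into parts before rendering.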
When the "multipart/mixed" content type is used in an HTTP header, it is followed by a keyword "boundary=" and some text string. This text string is also located within the returned document as a delimiter between the different document parts, and enables separation of the different content types in the document. For example, if the boundary string is defined as "--abc123XYZ987" then this string may be used to delimit parts of a document containing a JPEG image and ASCII text as shown below:
--abc123XYZ987
Content-type: image/jpeg
< . . . the image content . . . >
--abc123XYZ987
Content-type: text/ascii
< . . . some ASCII text . . . >
--abc123XYZ987--
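The separation of a delimited document into its parts may be sketched as shown next. This is a simplified illustration in Python, assuming a textual body in which each part begins with a single "Content-type" line, as in the example above; it is not a complete MIME parser, and the function name split_parts and the "text/plain" fallback are hypothetical.

```python
def split_parts(body, delimiter):
    """Return a list of (content_type, payload) pairs from a delimited body.

    Assumptions of this sketch: each delimiter occupies its own line, the
    final delimiter carries a trailing "--", and the first line of each
    part is its Content-type header.
    """
    parts = []
    current = None  # lines of the part being collected, or None outside a part
    for line in body.splitlines():
        marker = line.strip()
        if marker == delimiter + "--":  # closing delimiter ends the document
            if current is not None:
                parts.append(current)
            current = None
            break
        if marker == delimiter:  # a delimiter starts a new part
            if current is not None:
                parts.append(current)
            current = []
        elif current is not None:
            current.append(line)

    result = []
    for lines in parts:
        if not lines:
            continue
        name, _, value = lines[0].partition(":")
        if name.strip().lower() == "content-type":
            content_type = value.strip().lower()
            payload_lines = lines[1:]
        else:
            content_type = "text/plain"  # fallback chosen for this sketch
            payload_lines = lines
        result.append((content_type, "\n".join(payload_lines).strip()))
    return result
```

Applied to the example above with the delimiter "--abc123XYZ987", such a routine would yield one pair for the image part and one pair for the text part.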
Computing devices are becoming smaller and more specialized as computing becomes more pervasive in today's world. Because of their increased portability, these smaller devices enable the user to perform computing functions regardless of where he happens to be at the time, and some allow a user to easily transport the device as the user moves about in his daily activities. Examples of this type of computing device include: Web-enabled cellular phones; wearable computing devices; devices mounted in a vehicle, such as an on-board navigation system; computing devices adapted to use in the home, such as an intelligent sensor built into a kitchen appliance; mobile computers; handheld computers such as the WorkPad from the International Business Machines Corporation ("IBM"); etc. ("WorkPad" is a registered trademark of IBM.) As computing devices become smaller and more specialized, however, the functions available on a particular device are fewer in number and typically more scaled-down in function. A Web-enabled cellular phone, for example, may be able to display only a small amount of text on its limited-size display screen while having no capability for processing image or video files. A wearable computing device, on the other hand, may be able to process sound files but not display text.
Many existing Web pages have been created with the expectation that they would be delivered to a full-function Web browser executing on a personal computer, with helper applications, applets, and/or plug-ins readily available for processing any content types included as part of the Web document. This is not necessarily the case as the smaller and more specialized computing devices are also capable of requesting and receiving Web documents. In the vehicle environment, for example, multiple devices may be available with each capable of processing a different combination of text, image, and sound; however, these disparate devices are unlikely to be integrated into a single unit. Instead, the devices are likely to be physically separate special-purpose devices. Consequently, a Web browser cannot simply route the received content to the appropriate renderers for the received content type(s) because those renderers are not coupled together. This is also true in the home networking environment where the home network may include: display-only devices spread strategically throughout the house, where these display devices may be unable to render streamed video data; modules in appliances that send and receive data such as status information (including equipment failure indicators) to other devices (such as personal computers) located in the home, where these modules are unlikely to have audio, image, video, or sometimes even text display capability. Finally, as previously discussed, a wearable computing device may be very limited in the types of content it can render.
These environments of specialized computing devices, each having different user interface capabilities, will become more commonplace in the near future. However, today's Web model makes it impossible for a single document to simultaneously drive multiple user interfaces spread among these different devices. Accordingly, what is needed is a technique with which these devices can cooperate to render a multi-modal Web document.
An object of the present invention is to provide a technique whereby multi-modal document content can be received, demultiplexed, and distributed to one or more appropriate content renderers.
Another object of the present invention is to hide the physical identity of the content renderers from the server from which the document is retrieved.
Another object of the present invention is to provide this technique in a manner whereby content renderers pre-register the content type(s) that they are capable of processing.
Another object of the present invention is to provide this technique in a manner whereby content rendering capabilities are dynamically determined by issuing a network query message.
Other objects and advantages of the present invention will be set forth in part in the description and in the drawings which follow and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides a method, system, and computer program product for receiving and demultiplexing multi-modal document content. This technique comprises: providing a demultiplexing (demux) component; providing a plurality of content renderers coupled to the demux component via a network; generating a document request from a first client; sending the document request over an external network to a document server; receiving, at the demux component, a response document returned by the document server in response to the sent document request; locating at least one content type in the received response document; locating a selected one of the content renderers which is capable of rendering the at least one content type; and distributing a document content associated with the at least one content type to the located selected one content renderer.
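The locating and distributing steps recited above may be pictured, purely as a sketch, in the following Python fragment. It assumes the response document has already been separated into (content type, content) pairs; the names demultiplex, locate_renderer, and deliver are hypothetical, and the two callables stand in for whatever lookup and transport mechanisms an embodiment uses.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def demultiplex(
    parts: Iterable[Tuple[str, bytes]],
    locate_renderer: Callable[[str], Optional[str]],
    deliver: Callable[[str, str, bytes], None],
) -> List[str]:
    """Distribute each document part to a renderer able to handle its type.

    Returns the content types for which no renderer could be located, so
    that the caller can decide how to treat them (an assumption of this
    sketch; the text does not prescribe a policy).
    """
    unhandled = []
    for content_type, content in parts:
        renderer = locate_renderer(content_type)  # locate a capable renderer
        if renderer is None:
            unhandled.append(content_type)
            continue
        deliver(renderer, content_type, content)  # distribute the content
    return unhandled
```

Because the renderer lookup is passed in as a callable, the same loop could be used with either a stored registry or a dynamic network query.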
The technique may further comprise repeatedly executing the location of the at least one content type, the location of the selected one content renderer, and the distributing.
Or, the technique may further comprise receiving the distributed document content at the selected content renderer and rendering the received document content.
The generated document request may be generated as a HyperText Transfer Protocol (HTTP) message, or it may be generated as a Wireless Session Protocol (WSP) message.
Locating the selected one of the content renderers may further comprise using the at least one content type to access a stored registry of content type to content renderer mappings. In addition, the technique may further comprise creating at least one entry in the stored registry of content type to content renderer mappings. This creation may comprise: sending, from the one or more devices on which the plurality of content renderers are executing, one or more content registration messages to the demux controller, each of the messages indicating a particular content type which the device is capable of rendering and an identifier of the device; receiving the registration messages at the demux controller; and using the particular content type and the device identifier from each of the received registration messages to create or update a corresponding entry in the registry. The registration messages may conform to a Universal Plug and Play protocol, or they may conform to a Jini protocol.
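A stored registry of content type to content renderer mappings might be sketched as follows. The message layout (a dictionary with "content_type" and "device_id" keys) and the class name RendererRegistry are assumptions made for illustration; the sketch does not implement the Universal Plug and Play or Jini message formats.

```python
from collections import defaultdict


class RendererRegistry:
    """Illustrative registry mapping content types to device identifiers."""

    def __init__(self):
        self._by_type = defaultdict(list)

    def register(self, message):
        """Create or update an entry from one registration message.

        `message` is a hypothetical dictionary carrying the content type
        the device can render and an identifier of the device.
        """
        content_type = message["content_type"].strip().lower()
        device_id = message["device_id"]
        devices = self._by_type[content_type]
        if device_id not in devices:  # repeated registrations are harmless
            devices.append(device_id)

    def lookup(self, content_type):
        """Return one device registered for the type, or None if none is."""
        devices = self._by_type.get(content_type.strip().lower())
        return devices[0] if devices else None
```

Returning the first registered device is a choice of this sketch; an embodiment could equally apply a preference order or return all candidates.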
Alternatively, locating the selected one of the content renderers may further comprise: issuing a network query from the demux component, the network query specifying the content type; receiving the issued network query by one or more devices on which the content renderers are located; making a determination, by each of the one or more devices, whether to respond to the received query, wherein the determination by each particular one of the devices is based on one or more capabilities of the particular device and the content renderers located thereupon; sending a response to the received query from selected ones of the devices when the determination has a positive result; and receiving the responses at the demux component.
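The query-based alternative may be sketched in the same spirit. The fragment below simulates the exchange in a single process rather than over a real network; the class Device, the method handle_query, and the function query_network are hypothetical names introduced only for this illustration.

```python
class Device:
    """Illustrative device hosting renderers for a set of content types."""

    def __init__(self, device_id, renderable_types):
        self.device_id = device_id
        self.renderable_types = {t.lower() for t in renderable_types}

    def handle_query(self, content_type):
        """Decide whether to respond to a query for `content_type`.

        Returns the device identifier as a positive response, or None to
        remain silent when the device cannot render the type.
        """
        if content_type.lower() in self.renderable_types:
            return self.device_id
        return None


def query_network(devices, content_type):
    """Issue the query to every device and collect the positive responses."""
    responses = []
    for device in devices:
        reply = device.handle_query(content_type)
        if reply is not None:
            responses.append(reply)
    return responses
```

In this arrangement no registry is kept by the demux component; capability is determined at the time each content type is encountered.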
The technique may further comprise initiating a HyperText Transfer Protocol (HTTP) request from at least one of the one or more devices on which the plurality of content renderers are executing, the at least one device including a selected device on which the selected one of the content renderers is executing, and receiving the HTTP request at the demux component, thereby establishing an outstanding request from each of the at least one devices. In this case, distributing the document content may further comprise distributing the document content to the selected one content renderer on an open connection associated with a selected one of the outstanding requests, the selected one of the outstanding requests being that one initiated by the selected device.
Alternatively, the technique may further comprise initiating a Wireless Session Protocol (WSP) request from at least one of the one or more devices on which the plurality of content renderers are executing, the at least one device including a selected device on which the selected one of the content renderers is executing, and receiving the WSP request at the demux component, thereby establishing an outstanding request from each of the at least one devices. In this case, distributing the document content may further comprise distributing the document content to the selected one content renderer on an open connection associated with a selected one of the outstanding requests, the selected one of the outstanding requests being that one initiated by the selected device.
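The notion of an outstanding request, whether carried over HTTP or WSP, may be pictured with the following in-process sketch. A queue stands in for the open connection on which the response is eventually returned; the class OutstandingRequests and its methods are hypothetical, and no real network transport is modeled.

```python
import queue


class OutstandingRequests:
    """Illustrative bookkeeping of requests held open by the demux component."""

    def __init__(self):
        self._pending = {}  # device identifier -> slot awaiting a response

    def open_request(self, device_id):
        """Record a request from a device and return its response slot."""
        slot = queue.Queue(maxsize=1)
        self._pending[device_id] = slot
        return slot

    def distribute(self, device_id, content_type, content):
        """Answer the device's outstanding request with the document content.

        Returns False when the device has no request outstanding, in which
        case nothing is sent.
        """
        slot = self._pending.pop(device_id, None)
        if slot is None:
            return False
        slot.put((content_type, content))
        return True
```

Since each request is answered once, a device wishing to receive further content would issue a new request after each response, under the assumptions of this sketch.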
Distributing the document content may further comprise issuing a HyperText Transfer Protocol (HTTP) POST message to the selected one content renderer.
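As a minimal sketch of this alternative, an HTTP POST message carrying one document part could be constructed in Python with the standard urllib.request module as shown below. The function name build_post and any renderer address passed to it are hypothetical; the sketch only builds the message and does not send it.

```python
import urllib.request


def build_post(renderer_url, content_type, content):
    """Build an HTTP POST request carrying one document part.

    `renderer_url` is assumed to be an address at which the selected
    content renderer accepts content; `content` must be bytes.
    """
    return urllib.request.Request(
        renderer_url,
        data=content,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

The resulting request object could then be passed to urllib.request.urlopen to transmit it, assuming the renderer is reachable at the given address.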