1. Field of the Invention
The present invention relates to a computer system, and deals more particularly with a method, system, and computer-readable code for dynamically constructing page bundles on demand at a server, and downloading the bundles to a requesting client that may be portable and may have intermittent network connectivity. The bundles may then be accessed at the client, without requiring an on-going network connection. A bundle may include one or more servlets, enabling dynamic content generation at the client.
2. Description of the Related Art
It is commonplace today for computer users to connect their machines to other computers, known as “servers,” through a network. The network may be a private network, such as a corporate intranet of networked computers that is accessible only to computer users within that corporation, or it may be a public network, such as the Internet or World-Wide Web. The Internet is a vast collection of computing resources, interconnected as a network, from sites around the world. The World-Wide Web (referred to herein as the “Web”) is that portion of the Internet which uses the HyperText Transfer Protocol (“HTTP”) as a protocol for exchanging messages. (Alternatively, other protocols such as the “HTTPS” protocol can be used, where this protocol is a security-enhanced version of HTTP.)
The user may connect his computer to a server using a “wireline” connection or a “wireless” connection. Wireline connections are those that use physical media such as cables and telephone lines, whereas wireless connections use media such as satellite links, radio frequency waves, and infrared waves. Many connection techniques can be used with these various media, such as: using the computer's modem to establish a connection over a telephone line; using a Local Area Network (LAN) card such as Token Ring or Ethernet; using a cellular modem to establish a wireless connection; etc. The user's computer may be any type of computer processor, including laptop, handheld or mobile computers; vehicle-mounted devices; cellular telephones and desktop screen phones; peripherals (e.g. printers, fax machines, etc.); desktop computers; mainframe computers; etc., having processing and communication capabilities. The remote server, similarly, can be one of any number of different types of computer which have processing and communication capabilities. These techniques are well known in the art, and the hardware devices and software which enable their use are readily available. Hereinafter, the user's computer will be referred to as a “workstation,” and use of the terms “workstation” or “server” refers to any of the types of computing devices described above.
A user of the Internet typically accesses and uses the Internet by establishing a network connection through the services of an Internet Service Provider (ISP). An ISP provides computer users the ability to dial a telephone number using their workstation modem (or other connection facility, such as satellite transmission), thereby establishing a connection to a remote computer owned or managed by the ISP. This remote computer then makes services available to the user's computer, hence such computers are often referred to as “servers.” Typical services include: providing a search facility to search throughout the interconnected computers of the Internet for items of interest to the user; a browse capability for displaying information located with the search facility; and an electronic mail facility, with which the user can send and receive mail messages to and from other computer users. Similar facilities are typically available when the user connects to a server in an intranet or an extranet (that is, a network owned or managed by another company and which provides services in a similar manner to the Internet or an intranet).
The user working in a networked environment will have software running on his workstation to allow him to create and send requests for information to a server and to see the results. When the user connects to the Web, these functions are typically combined in software that is referred to as a “Web browser,” or “browser.” After the user has created his request using the browser, the request message is sent out into the Internet for processing. The target of the request message is one of the interconnected servers in the Internet network. That server will receive the message, attempt to find the data satisfying the user's request, format that data for display with the user's browser, and return the formatted response to the browser software running on the user's workstation. The response is typically in the form of a display, referred to as a “Web page,” that may contain text, graphics, images, sound, video, etc. The user will also typically have an electronic mail (“e-mail”) software package installed on his workstation, which enables him to send and receive e-mail to and from the workstations of other computer users. Additionally, the user may have software on his workstation that supports sending requests to, and receiving responses from, automated file delivery services. For example, the File Transfer Protocol (“FTP”) may be used to retrieve a file stored in a remote location to the user's workstation.
These are examples of a client-server model of computing, where the machine at which the user requests information is referred to as the client, and the computer that locates the information and returns it to the client is the server. In the Web environment, the server is referred to as a “Web server.” The client-server model may be extended to what is referred to as a “three-tier architecture.” This architecture places the Web server in the middle tier, where the added tier typically represents data repositories of information that may be accessed by the Web server as part of the task of processing the client's request. This three-tiered architecture recognizes the fact that many client requests do not simply require the location and return of static data, but require an application program to perform processing of the client's request in order to dynamically create and format the data to be returned. In this architecture, the Web server augmented by the component performing this processing may be referred to as an “application server.”
As more people connect their workstations to the Web, the number of messages and files being sent is skyrocketing. (Hereinafter, the terms “message” and “file” are used interchangeably when referring to data being sent through a network, unless otherwise stated.) Coupled with this increase in the number of network users and files is an increase in the size of the files commonly being sent. For example, a short e-mail message with a relatively simple graphic image attached may be on the order of several hundred thousand bytes of data. Users may send and receive many such files over the course of a day's work or in their personal network communications.
A great deal of user frustration can result when trying to access popular Web sites which must service an ever-increasing number of user requests, and which often have slow response times due to this heavy request load. Additionally, long delays may result when users request delivery of large files to their workstation (or even when requesting relatively small files from congested servers), creating yet more user frustration. The popularity of using portable computers such as handheld devices for connecting to the Internet, or other networks of computers, is increasing as user interest in computing becomes pervasive and users are more often working in mobile environments. At the same time, the popularity of making network connections using connection services that charge fees based upon the duration of connections (such as cellular services, which are commonly used for wireless connections from portable computers) is also growing. When using this type of relatively expensive connection, the longer the user must wait to receive a file, the higher his connection charges will be. Wireless connections also tend to have high network latencies, due to the limited bandwidth available and the extra network hops (e.g. gateways) that are involved with wireless transmission. As a result, a user may have to wait a relatively long time to receive a response to a request he has sent into the network. These are some of the factors behind an increasing tendency of Web users to work offline with Web pages, whereby the user selects pages for downloading to his workstation from a Web server and then uses a browser to view this local copy of the pages after having disconnected from the network.
When a user is interacting with the Internet, the browser running on the user's workstation typically accepts the data it will display in response to the user's request as a data stream formatted using the HyperText Markup Language (“HTML”). HTML is a standardized notation for displaying text and graphics on a computer display screen, as well as for providing more complex information presentation such as animated video, sound, etc. When browsers expect an incoming response to be formatted using HTML, servers generate their response in that format. The browser processes the HTML syntax upon receipt of the file sent by the server (or from parsing a local copy of the file, when working offline), and renders a Web page according to the instructions specified by the HTML commands. Browsers are also commercially available for notations other than HTML that are used for specifying Web content. Common examples of these other notations are the Extensible Markup Language (“XML”), and pages represented in other standard formats such as the Wireless Markup Language (“WML”).
Web pages were originally created to have only static content. That is, a user requested a specific page, and the predefined contents of that page were located by a Web server and returned for formatting and display at the user's computer. To change the page content or layout, the HTML syntax (or other notation) specifying the page had to be edited. However, the Web is moving toward dynamic page content, whereby the information to be displayed to the user for a given page can be generated dynamically when each request is received at the server.
With dynamically-generated content, a request for the Web page stored at a given Uniform Resource Identifier (“URI”) or Uniform Resource Locator (“URL”) may result in a wide variety of page content being returned to the user. (References to “URL” hereinafter are intended to include URIs unless stated otherwise.) One common, simple use of dynamic page content is the “visitor counts” which are often displayed on Web pages, with text such as “You are the 123rd visitor to this site since Jan. 1, 1997” (where the count of visitors is accumulated at the server and inserted into the HTML syntax before returning the page to the user). Other simple uses include displaying the current date and time on the page. More advanced techniques for dynamic content allow servers to provide Web pages that are tailored to the user's identification and other available information about the user. For example, servers providing travel reservation services commonly store information about the travel preferences of each of their users and then use this information when responding to inquiries from a particular user. Dynamic content may also be based upon user classes or categories, where one category of users will see one version of a Web page and where users in another category will see a different version, even though all users provided the same URL to request the Web page from the same server. For example, some Web server sites provide different services to users who have registered in some manner (such as filling out an on-line questionnaire) or users who have a membership of some type (which may involve paying a fee in order to get enhanced services, or more detailed information). The difference in dynamic content may be as simple as including the user's name in the page, as a personalized electronic greeting, or the dynamic content may be related to the user's past activities at this site. On-line shopping sites, for example, may include a recognition for repeat shoppers, such as thanking them for their previous order placed on some specific day or offering a special limited-availability discount.
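The kinds of dynamic content described above (a visitor count accumulated at the server, a personalized greeting, and category-specific content) can be illustrated with a minimal sketch. The names `User`, `PageServer`, `is_member`, and `last_order_date` are hypothetical and chosen only for illustration; they do not come from any particular Web server product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    is_member: bool = False                 # hypothetical membership flag
    last_order_date: Optional[str] = None   # hypothetical record of past activity


class PageServer:
    """Hypothetical server that fills in a page when each request arrives."""

    def __init__(self) -> None:
        self.visitor_count = 0  # accumulated at the server across requests

    def render(self, user: Optional[User] = None) -> str:
        # The same URL yields different content depending on who is asking.
        self.visitor_count += 1
        lines = [f"<p>You are visitor number {self.visitor_count}.</p>"]
        if user is not None:
            lines.append(f"<p>Welcome back, {user.name}.</p>")
            if user.last_order_date:
                lines.append(
                    f"<p>Thank you for your order of {user.last_order_date}.</p>"
                )
            if user.is_member:
                lines.append("<p>Member-only details are shown here.</p>")
        return "<html><body>" + "".join(lines) + "</body></html>"
```

An anonymous request and a request from a registered member thus receive different pages from the same server-side logic.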
A number of techniques for providing dynamic page content exist. One such technique is use of an Active Server Page (“ASP”) on a Microsoft Web server, which detects a specific command syntax in an HTML page and processes the embedded commands before returning the page to the user. Another technique is the use of servlets, which are executable code objects that can be dynamically invoked by the Web server to process a user request. Servlets typically perform some specialized function, such as creating page content based on dynamic factors. Alternatively, Dynamic Server Pages (“DSPs”) or Java Server Pages (“JSPs”) may be used to create dynamic content using compiled Java on Java-aware Web servers. (“Java” is a trademark of Sun Microsystems, Inc.) CGI (“Common Gateway Interface”) scripts and applications may also be used as sources of dynamic content.
Software programs known as “data mining” applications deduce patterns and/or relationships from data stores such as databases using statistical analysis techniques. One common usage of data mining is to track user behavior patterns when accessing a Web server. By monitoring sequences of requests, the software may deduce a user's request patterns over time and may also infer a user's future behavior using these deduced patterns. As a simple example, suppose a user requests to view an on-line television schedule from a server which begins by requesting the user's zip code, and then offers a selection of (1) broadcast and cable providers in that zip code, and (2) viewing time periods within the day. If the user always requests the same zip code, the same cable provider, and the evening prime-time viewing hours, a data mining application may detect this pattern and establish it as an automatic default for this user. By monitoring request patterns in this way, the server applications can provide customized treatment for repeat viewers, eliminating the annoyance that results when the user has to repeatedly enter the same data upon each visit, while still allowing new and repeat visitors the full flexibility of options from which to select. More complex patterns can also be detected by data mining, including which page(s) a particular user is likely to request during a specific type of interaction; the page sequence most often followed by new users at a particular site; whether a different page sequence is preferred by users who have accessed the site more than some ascertainable number of times (skipping introductory material, for example); etc. As electronic commerce becomes more prevalent on the Web, and electronic businesses become increasingly competitive, tracking user behavior patterns in this manner will be increasingly valuable and commonplace.
Examples of data mining software products that are commercially available include “SurfAid” and “Intelligent Miner” from IBM. Refer to the Web site “netmining.dfw.ibm.com” for more information about SurfAid, and “www.software.ibm.com./data/iminer” for more information about Intelligent Miner, or contact your local IBM branch office. (“Intelligent Miner” is a trademark of IBM.)
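The television-schedule example above, in which a repeated combination of choices is promoted to an automatic default, might be sketched as follows. This is a deliberately simplified illustration and not the method used by any commercial data mining product; the function name `infer_defaults`, the `min_visits` threshold, and the field names in the usage example are assumptions introduced here.

```python
from collections import Counter
from typing import Dict, List, Optional

# One entry per visit: the choices the user made on the schedule form.
Visit = Dict[str, str]


def infer_defaults(history: List[Visit], min_visits: int = 3) -> Optional[Visit]:
    """Return the user's usual choices if every recorded visit agrees.

    min_visits is an assumed threshold; the text does not specify one.
    """
    if len(history) < min_visits:
        return None
    # Make each visit hashable so identical visits can be counted.
    counts = Counter(tuple(sorted(v.items())) for v in history)
    choice, seen = counts.most_common(1)[0]
    # "Always requests the same" is modeled as: all visits are identical.
    return dict(choice) if seen == len(history) else None
```

For instance, three identical visits such as `{"zip": "27709", "provider": "cable-a", "period": "prime-time"}` (made-up values) would yield that mapping as the default, whereas a history containing any differing visit yields no default and leaves the full set of options to the user.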
In the presence of these factors, computer users need a way to work offline efficiently, viewing and interacting with Web pages without the expense and processing delays that occur with a network connection, while still being able to perform productive work. Users often have no way of knowing which pages they need for their offline work, especially when one page may provide links to many other pages, and thus may find it difficult to determine which pages should be downloaded if they wish to work in this mode. If one or more pages is needed during the offline interaction that was not downloaded during the connection, the user will find that he cannot complete his intended work without making another network connection to retrieve missing pages. More than one additional network connection may be required, if the user still fails to correctly predict the pages he needs in a subsequent download operation. As the level of “computer savvy” of the average Internet user decreases with the expansion of Internet usage into the general public, an average user is decreasingly likely to be able to accurately pre-select a complete subset of Web pages for offline viewing.
Accordingly, a need exists for a technique by which multiple Web pages can be dynamically bundled (i.e. packaged) and downloaded for accessing on a user's workstation, enabling the user to perform a meaningful interaction even in the absence of an ongoing network connection. The proposed technique uses an on-demand bundling approach, ensuring that a requesting user will receive the most recent versions of any bundled files. The proposed technique often serves to reduce the number and duration of network connections required, enabling a user to work productively while offline. Further, the bundle may optionally contain executable code such as one or more servlets, which will execute on the user's workstation to enable dynamic content generation. Messages may be created and queued during processing of the downloaded bundle, for sending to a server when the user subsequently establishes a network connection. Optionally, data mining software may be used advantageously with this technique, to increase the likelihood of constructing a bundle that will meet the user's needs throughout the offline interaction. Additionally, transcoding may optionally be performed on a bundle destined for a particular user to tailor the bundled software to the user's current working environment.
An object of the present invention is to provide a technique with which multiple Web pages can be dynamically bundled and downloaded for accessing on a user's workstation, enabling the user to perform a meaningful interaction even in the absence of an ongoing network connection.
Another object of the present invention is to provide a technique whereby this bundling occurs on demand, ensuring that a requesting user will receive the most recent versions of any bundled files.
It is a further object of the present invention to provide a technique whereby the number and duration of network connections required is reduced, enabling a user to work productively while offline.
It is another object of the present invention to provide a technique whereby the bundle may contain executable code such as one or more servlets, which will execute on the user's workstation to enable dynamic content generation.
It is yet another object of the present invention to provide a technique whereby messages may be created and queued during processing of the downloaded bundle, for sending to a server when the user subsequently establishes a network connection.
Other objects and advantages of the present invention will be set forth in part in the description and in the drawings which follow and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides a software-implemented technique for use in a computing environment capable of having a connection to a network for enabling offline Web page processing, comprising: receiving a request for a Web page bundle at a server in the network; dynamically constructing the Web page bundle; and downloading the dynamically constructed Web page bundle. Preferably, the dynamically constructing further comprises: accessing a repository wherein a plurality of bundle descriptors are stored; determining if one of the bundle descriptors matches the request; using the matching bundle descriptor to locate and retrieve one or more stored files referenced therein when the determining has a positive outcome; locating and retrieving a single file specified by the request when the determining has a negative outcome; and formatting the located and retrieved files into the dynamically constructed bundle. Using the matching bundle descriptor preferably further comprises locating and retrieving at least one servlet capable of creating dynamic content. A content-reducing transformation may optionally be applied to one or more of the located and retrieved files prior to the formatting. The dynamically constructing may further comprise using results of a data mining operation, and/or locating and using embedded page references.
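The server-side steps just described (consult the descriptor repository, fall back to the single requested file when no descriptor matches, optionally apply a content-reducing transformation, and format the results into a bundle) can be sketched as below. This is a minimal illustration under stated assumptions, not the claimed implementation: the repository and file store are modeled as in-memory dictionaries, the ZIP archive is merely one convenient bundle format chosen for the sketch, and the names `build_bundle`, `Descriptors`, `FileStore`, and `reduce_content` are hypothetical.

```python
import io
import zipfile
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory stand-ins for the descriptor repository and file store.
Descriptors = Dict[str, List[str]]   # requested URL -> files in its bundle
FileStore = Dict[str, bytes]         # file path -> stored content


def build_bundle(
    request_url: str,
    descriptors: Descriptors,
    files: FileStore,
    reduce_content: Optional[Callable[[bytes], bytes]] = None,
) -> bytes:
    """Construct a bundle for request_url at the time of the request."""
    # Matching descriptor: bundle every file it references.
    # No match: fall back to the single requested file.
    members = descriptors.get(request_url, [request_url])

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path in members:
            content = files[path]                  # locate and retrieve
            if reduce_content is not None:
                content = reduce_content(content)  # optional transformation
            archive.writestr(path, content)        # format into the bundle
    return buffer.getvalue()
```

Because the files are read when the request arrives rather than packaged ahead of time, the bundle reflects whatever versions are stored at that moment, which corresponds to the on-demand property described in the text.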
The present invention also provides a software-implemented technique for intercepting a user request for a page, the intercepting operating on a client in the network; determining if the page is stored locally; retrieving the requested page from a local storage when the determining has a positive outcome; sending a page bundle request to a server in the network when the determining has a negative outcome; receiving the requested page bundle; storing the received page bundle; and delivering the requested page to the user. Optionally, delivering the requested page may further comprise locating and executing at least one servlet capable of creating dynamic content.
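The client-side flow described above (intercept the request, check local storage, request a bundle only on a miss, store it, and deliver the page) might look like the following sketch. The class name `OfflinePageCache` and the `fetch_bundle` callable are assumptions made for illustration; the callable stands in for the network request and is assumed to return the bundle as a mapping from page URL to page content. Servlet execution is omitted.

```python
from typing import Callable, Dict

# Hypothetical callable standing in for the network request: it takes a page
# URL and returns the bundle as a mapping of page URL -> page content.
FetchBundle = Callable[[str], Dict[str, str]]


class OfflinePageCache:
    def __init__(self, fetch_bundle: FetchBundle) -> None:
        self._fetch_bundle = fetch_bundle
        self._local: Dict[str, str] = {}   # local storage of downloaded pages

    def get_page(self, url: str) -> str:
        """Intercept a page request and serve it locally when possible."""
        if url not in self._local:
            bundle = self._fetch_bundle(url)   # send bundle request to server
            self._local.update(bundle)         # store the received bundle
        # Deliver the requested page; raises KeyError if the bundle lacked it.
        return self._local[url]
```

In this sketch, a first request for one page downloads the whole bundle, so later requests for other pages in the same bundle are satisfied from local storage without another network connection.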
The present invention will now be described with reference to the following drawings, in which like reference numbers denote the same element throughout.