1. Technical Field
This invention relates generally to information retrieval in a computer network. More particularly, the invention relates to a method and system for reducing the amount of storage space required to support Web server content and for improving the manner in which such content is served in response to client requests.
2. Description of the Related Art
The World Wide Web is the Internet's multimedia information retrieval system. In the Web environment, client machines effect transactions to Web servers using the Hypertext Transfer Protocol (HTTP), which is a known application protocol providing users access to files (e.g., text, graphics, images, sound, video, etc.) using a standard page description language known as Hypertext Markup Language (HTML). HTML provides basic document formatting and allows the developer to specify "links" to other servers and files. In the Internet paradigm, a network path to a server is identified by a so-called Uniform Resource Locator (URL) having a special syntax for defining a network connection. Use of an HTML-compatible browser (e.g., Netscape Navigator or Microsoft Internet Explorer) at a client machine involves specification of a link via the URL. In response, the client makes a request to the server identified in the link and, in return, receives a document or other object formatted according to HTML. A collection of documents supported on a Web server is sometimes referred to as a Web site.
A conventional Web server uses a direct access storage device (DASD) to store content files. Such files are stored in an uncompressed state. A given server often supports many different Web sites, each identified by a URL. A URL may have associated therewith multiple subdirectories, with each such subdirectory comprising numerous files. Many of those files are large, rich media files that have significant storage overhead.
With the current usage of the Internet as a mechanism for distributing graphics-intensive files and other information (including software), many content-intensive sites exhibit low performance in both loading and rendering on client machines. One primary reason for this problem is simply the size of the server files that must be downloaded to a given client machine in response to an HTTP request. The problem of increasingly slow server download delivery has been addressed historically by making large investments in high-performance hardware (e.g., modems, faster servers, and the like). While this approach has ameliorated the problem to some degree, improving hardware performance is quite costly.
Increasing bandwidth is another approach to addressing slow server downloads. This approach also has its benefits, but the operator of an individual Web server (or server farm) has little control over the Internet backbone and other network devices that facilitate the actual transmissions.
It would be desirable to provide a server-centric approach to resolving prior art server storage and delivery problems.
It is a primary object of this invention to provide server-side methods for optimizing storage of server content and for dynamically serving such content in response to client requests.
It is another primary object of this invention to increase the delivery effectiveness of a server operating in a computer network.
Another object of this invention is to provide a mechanism for conserving storage space on a Web server and (assuming client support) for reducing the size of the data stream served to a client browser in response to an HTTP client request.
A more specific object of this invention is to decrease DASD storage requirements on a Web server, thereby extending the useful life of the server's existing hardware.
A more general object of this invention is to better organize content on a Web server for effective storage and delivery. As a by-product, owners and operators of Web servers may receive the benefits of existing investments in their server hardware.
These and other objects of the invention are realized using a server-side mechanism together with an optional client-side decompression process. The server-side mechanism preferably comprises a pair of processes: a daemon process and a servlet process. The daemon process is a server-side executable that executes seamlessly and transparently to the regular operation and Web serving tasks on the host. The daemon process recursively compresses directories of content (HTML, graphics files, and the like) while the server, in parallel, serves content. When a target directory is completely compressed, the files which previously existed in an uncompressed state are either archived or deleted.
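The daemon's compress-then-retire behavior can be illustrated with a minimal sketch. It is not the disclosed implementation: the function name `compress_tree`, the `archive_dir` parameter, the `.gz` naming, and the choice of gzip are all assumptions made for illustration, and the sketch runs once in the foreground rather than in parallel with a live server.

```python
import gzip
import shutil
from pathlib import Path


def compress_tree(root, archive_dir=None):
    """Recursively gzip every file under root (hypothetical helper).

    Originals in a directory are archived (archive_dir given) or deleted
    (archive_dir is None) only after every file in that directory has
    been compressed. archive_dir is assumed to lie outside root.
    """
    root = Path(root)
    compressed = []

    # Descend into subdirectories first.
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            sub_archive = None if archive_dir is None else Path(archive_dir) / entry.name
            compressed += compress_tree(entry, sub_archive)

    # Files already ending in .gz are treated as compressed and skipped.
    originals = [p for p in sorted(root.iterdir())
                 if p.is_file() and p.suffix != ".gz"]

    # Step 1: write a compressed copy next to each original.
    for src in originals:
        dst = src.with_name(src.name + ".gz")
        with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        compressed.append(dst)

    # Step 2: the directory is now fully compressed, so retire the originals.
    for src in originals:
        if archive_dir is None:
            src.unlink()
        else:
            Path(archive_dir).mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(Path(archive_dir) / src.name))

    return compressed
```

Deferring removal until the whole directory has been processed means an uncompressed copy of each file remains available while compression of that directory is still in progress.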
The servlet process interprets the compressed objects, resolving the connection between the client and server, and serves out the requested content. If the request originates from a client that is not enabled to decompress files, the servlet decompresses the requested files on-the-fly. The servlet also resolves connections when a client requests only part of the content within a compressed group of files. The daemon and servlet processes preferably operate asynchronously to each other.
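The servlet's choice between passing compressed content through and decompressing it on-the-fly can be sketched as follows. This is an illustrative sketch only: the function name `serve` is hypothetical, gzip is an assumed compression format, and detecting client capability through the standard HTTP `Accept-Encoding` request header is an assumption rather than something the text above specifies. Header-name case folding and `q=0` weights are deliberately ignored to keep the sketch short.

```python
import gzip


def serve(stored_gz, request_headers):
    """Return (response_headers, body) for a gzip-compressed stored object.

    stored_gz is the compressed object as held on the server;
    request_headers is a plain dict of request header names to values.
    """
    raw = request_headers.get("Accept-Encoding", "")
    # Keep only the coding names, dropping any ";q=..." parameters.
    accepted = [token.split(";")[0].strip().lower() for token in raw.split(",")]

    if "gzip" in accepted:
        # Client can decompress: send the stored bytes unchanged.
        return {"Content-Encoding": "gzip"}, stored_gz

    # Client cannot decompress: expand on-the-fly before sending.
    return {}, gzip.decompress(stored_gz)
```

For example, a request carrying `Accept-Encoding: gzip, deflate` would receive the stored bytes as-is, while a request with no such header would receive the decompressed content.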
When supported on a given client machine, the client process decompresses the streaming content for use on the client system. This functionality is implemented in a manner consistent with the architecture of the client browser.
Each of the processes of the present invention includes an application programming interface (API), namely, a program entry point, that allows the respective process to be extended by other software. As a result, different types of compression and their corresponding decompression routines are plug compatible with the architecture. Further, by interfacing through the API, the daemon and the servlet may readily support other markup language types (e.g., SGML, XML, HDML, and others) so that the inventive architecture is compatible with Internet appliances and other pervasive computing devices (e.g., palmtops, PDAs, cell phones, and the like) that do not include the full HTML function set.
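One way to picture plug-compatible compression routines is a small registry keyed by scheme name. The sketch below is an assumption-laden illustration, not the API described above: `CODECS`, `register_codec`, `compress_with`, and `decompress_with` are hypothetical names, and the gzip and bz2 routines from the Python standard library stand in for whatever compression types an implementer might plug in.

```python
import bz2
import gzip

# Hypothetical registry: scheme name -> (compress, decompress) callables.
CODECS = {}


def register_codec(name, compress, decompress):
    """Entry point through which other software adds a compression type."""
    CODECS[name] = (compress, decompress)


def compress_with(name, data):
    """Compress bytes using the routine registered under name."""
    return CODECS[name][0](data)


def decompress_with(name, data):
    """Decompress bytes using the routine registered under name."""
    return CODECS[name][1](data)


# Two stand-in schemes; each pairs a routine with its matching inverse.
register_codec("gzip", gzip.compress, gzip.decompress)
register_codec("bz2", bz2.compress, bz2.decompress)
```

Because callers look routines up only by name, a new scheme can be added with a single `register_codec` call and no change to the code that compresses or serves content.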
The foregoing has outlined some of the more pertinent objects and features of the present invention. These objects should be construed to be merely illustrative of some of the more prominent features and applications of the invention. Many other beneficial results can be attained by applying the disclosed invention in a different manner or modifying the invention as will be described. Accordingly, other objects and a fuller understanding of the invention may be had by referring to the following Detailed Description of the Preferred Embodiment.