The World Wide Web (“Web”), the leading information retrieval service of the Internet (a worldwide network of interconnected computers), has quickly grown into a widely accepted vehicle for the dissemination of information. The explosive growth of the Web is due, in part, to the enormous volume of information available, along with the ease with which the information can be accessed by Web users.
Presently, individuals, businesses, and government offices alike regularly rely on the Web for their information needs on a daily, if not minute-by-minute, basis. E-mail and electronic access to newspapers, financial reports, journals, and various specialized computer databases are just a few of the many convenient uses afforded by the Web to its users. With the number of Web servers linked to the Web expanding daily, the advent of faster connection speeds, and the ever-increasing dependency on the Web by corporate, government, and private interests, the tremendous growth of the Web can be expected to continue for years to come.
Typically, users in an office setting access the Web via workstations linked to a network architecture routed to the Web, while individuals in a home setting access the Web using a modem and a Point-to-Point Protocol (PPP) connection. A schematic of a typical network architecture 10 is shown in FIG. 1 in which multiple workstations 12 are bi-directionally connected to a local area network (LAN) server 16 via a connective path 15, which may include twisted-pair wire, coaxial cable, fiber optic, radio wave or other wireless transmission, or other interconnection types. Each workstation 12 is configured with its own CPU with which it executes software programs, such as Web browser programs, stored on the workstation or on other accessible devices. Workstations 12 are capable of generating data files which can be internally stored or transmitted to LAN server 16 for further disposition. Each workstation 12 is also capable of receiving data files from LAN server 16.
Workstations 12 are typically provided with a network interface card (“NIC”) 14 which is designed for the particular type of network, and which contains protocol rules and encoding specifications for sending and receiving data to and from LAN server 16, and to and from any other devices linked to LAN server 16. The protocols specified by NIC 14 determine, for example, the type of error checking to be used, the data compression method (if any), the means by which the sending device will indicate it has finished sending a message, and the means by which the receiving device will indicate it has received a message. Thus, by means of the network interconnections and the NIC 14, each workstation 12 can access data and devices also interconnected with LAN server 16. LAN server 16 is shown connected to a network peripheral device also containing an NIC 14, in this case a printer 18, which may be shared by workstations 12 via connections to LAN server 16 (i.e., printer 18 is a shared resource). A workstation 12 may also have a printer 18 directly connected to it.
LAN server 16 typically comprises a specially configured computer or device adapted for managing various resources connected to the LAN server 16 and network architecture 10. Many servers are “dedicated” in that they perform no other tasks besides their server tasks. For example, servers may be dedicated to managing network traffic (network servers), disk drives (file servers), or printers (print servers). One or more servers of varying types are typically found in an office-type network architecture.
In relation to LAN server 16, protocols specified by NIC 14 determine whether the network 10 uses a peer-to-peer or client/server architecture. Most commonly, a client/server architecture is used. Typically, “clients” are applications that run on workstations 12 and rely on a server to perform certain operations. For example, an e-mail client is an application that enables a workstation 12 to send and receive e-mail via LAN server 16 and an e-mail server 28, 29. Clients may rely on servers for interconnection with other devices, web access, resources (such as files), and in some cases, processing power.
In a typical network architecture 10, a LAN server 16 supports a Hyper Text Transfer Protocol (HTTP), the underlying communications protocol used by the Web. HTTP is also supported by NICs 14 on workstations 12, and defines how messages are formatted and transmitted on the Web, and what actions Web servers 24 and Web browsers should take in response to various commands. HTTP uses the Transmission Control Protocol (TCP) to transport all of its control and data messages from one computer to another. Although HTTP is most commonly used, Web browsers also may send network requests for information using other protocols, such as Simple Mail Transfer Protocol (SMTP) (commonly used to support e-mail clients), Gopher document protocol, File Transfer Protocol, etc.
Web servers 24 are remotely located on the Web 26 and are based on the client/server model, being responsible for storage and responsive retrieval of text, graphics, and other information. Well-known files which are typically transferred between a Web browser and a Web server 24 include Hyper Text Markup Language (“HTML”) files, the HTML dictating how Web pages are to be formatted and displayed on a Web browser. Other common file types include Graphics Interchange Format (“GIF”) files (bit-mapped graphics files found on the Web that use data compression techniques), Joint Photographic Experts Group (“JPEG”) files (which use a lossy compression technique for color images), and Java applets (small platform-independent Java-encoded applications that may be downloaded from a Web server to run on a workstation having a Java-compatible Web browser).
LAN server 16 is also connected to router 22, which provides a connection and access to the Web 26 for LAN server 16, and its interconnected workstations 12. Router 22 primarily functions to monitor Web functions by routing data (IP-packets) generated by workstations 12 through an HTTP server (not shown), and also by distributing data intended for workstations 12 or other devices on LAN server 16. An HTTP server may be provided as part of local network architecture 10, or more commonly by an Internet Service Provider (“ISP”) to the Web 26. By requesting a specific network address linked to the Web, router 22 can transmit data between a workstation 12 and a particular Web server 24 linked to the Web network 26 using the HTTP protocol.
For e-mail transmission, an e-mail client on a workstation 12 typically interacts with an SMTP server 28, which directs outgoing mail through the Web 26 to the Domain Name Server (DNS) specified by a particular e-mail address. A POP3 (Post Office Protocol) server 29 receives and distributes incoming e-mail according to commands issued by a recognized e-mail client which has successfully logged on to it. Although two different servers, SMTP server 28 and POP3 server 29, commonly run on a single machine (hereinafter referred to as an “e-mail server”). As in the case of an HTTP server, SMTP server 28 and POP3 server 29 may be provided as part of a local network architecture 10, or by a remotely located Internet service provider.
To access the Web 26, a user at a workstation 12 typically activates a graphical user interface software application used to locate and display Web pages, commonly called a Web browser, which will usually be stored on the hard drive of workstation 12. Well-known Web browsers include Netscape's Navigator® and Microsoft's Internet Explorer®. The activation of the Web browser on a workstation 12 initiates a suitable network connection to LAN server 16 and router 22 which, in turn, establish a connection to an HTTP server and the Internet (Web) 26.
To connect to a desired website having retrievable information, a network address designated a Uniform Resource Locator (URL) is entered into the Web browser. The URL identifies both the location of the site and one or more pages of information contained at that site, which is supported by a particular Web server 24. At each URL, one or more Web pages of text, graphics, or other information are stored on Web server 24 in a pre-defined hierarchy. The URL address may be supplied by the user in a variety of ways, including direct keyboard entry of the address, selection of a previously stored “bookmarked” address, or “clicking” on an appropriate hyper-text link appearing on a Web browser control bar or on a displayed Web page. Using the URL, the Web browser sends a command in the form of a retrieval request to the proper Web server 24 identified in the URL address. For example, when a URL is entered into a Web browser, the Web browser sends an HTTP command to Web server 24 directing Web server 24 to fetch and then transmit the requested data (Web page) identified by the URL.
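The retrieval request described above can be sketched in Python. This is only an illustration under stated assumptions, not the behavior of any particular Web browser: the URL `http://www.example.com/reports/index.html` and the helper name `build_get_request` are hypothetical, and the request is merely assembled as text rather than sent over a network connection.

```python
from typing import Tuple
from urllib.parse import urlsplit


def build_get_request(url: str) -> Tuple[str, int, str]:
    """Split a URL into host, port, and the text of an HTTP/1.1 GET request.

    Hypothetical helper for illustration only; a real browser sends many
    more header fields and handles redirects, caching, and errors.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme != "http" or host is None:
        raise ValueError(f"unsupported URL: {url!r}")

    port = parts.port or 80          # 80 is the default port for HTTP
    path = parts.path or "/"         # an empty path means the site root
    if parts.query:
        path += "?" + parts.query

    # The Host header names the Web server; the path names the page on it.
    host_header = host if port == 80 else f"{host}:{port}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request


# Example URL is an assumption, chosen only to show the split.
host, port, request = build_get_request(
    "http://www.example.com/reports/index.html"
)
```

In this sketch the host portion of the URL selects the Web server and the path portion selects the page stored on that server, mirroring the two roles of the URL noted above.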
The instructions sent by the Web browser are interpreted and executed by Web server 24. The requested data is then transmitted by Web server 24 over the Web back to router 22 and LAN server 16, where the file is distributed over the local network to the requesting workstation 12. The Web browser then assembles and displays the file by processing, for example, the HTML source code, allowing the user to view the text, graphics or other information rendered on the local display of workstation 12.
Web servers 24 are increasingly included as a part of office network architectures where they can be locally maintained and conveniently upgraded, rather than being situated at distally located ISPs. Normally, a Web server 24 will be dedicated as a stand-alone server with a direct connection to the Internet. A recent variation in the traditional use of Web servers, however, is disclosed in U.S. Pat. No. 5,956,487 to Venkatraman et al. (“Venkatraman et al.”), assigned to the assignee of the present invention. Venkatraman et al. describes a Web-based server configured to be embedded on networked devices, including printers, faxes, copiers, communication devices, etc. The embedded Web server is disclosed to provide Web and network-based access to the device through a device URL and associated device Web pages, which may be conveniently accessed by conventional Web browsers. In addition to eliminating the costs of providing a screen-based user interface mechanism, the embedded Web server advantageously makes use of HTML transported according to HTTP, which enables communication with a Web browser independently of the user's particular computer system platform executing the browser. Accordingly, the embedded Web server disclosed by Venkatraman et al. allows convenient remote user access of a configured device by means of the Web.
In accessing remotely located Web servers, however, the time it takes for the data to be transmitted over the Web may vary substantially, being principally dependent upon the bandwidth (“pipeline”) of the Internet connection, the size of the files, and the processing speed of the Web server. Bandwidth and file size are particularly determinative of data transmission speed, bandwidth referring to the amount of data that can be transmitted over the Web network in a fixed amount of time. Bandwidth is usually expressed in bits per second (bps) or bytes per second, and is directly proportional to the amount of data transmitted or received per unit time. For example, more bandwidth would be used to download a digitized photograph in one second than would be used to download a page of text in one second.
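The relationship between file size, bandwidth, and transmission time can be made concrete with a short Python sketch. It is an idealized calculation that ignores latency, protocol overhead, and Web server processing time; the file sizes and the 56,000 bps connection speed are assumed figures chosen only for illustration, and the function names are hypothetical.

```python
def transfer_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Idealized time to move size_bytes over a link of bandwidth_bps."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bps   # 8 bits per byte


def required_bps(size_bytes: int, seconds: float) -> float:
    """Bandwidth needed to move size_bytes within the given time."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return size_bytes * 8 / seconds


# Assumed example figures, not taken from the text above.
PHOTO_BYTES = 2_000_000    # a digitized photograph
TEXT_BYTES = 4_000         # a page of text
MODEM_BPS = 56_000         # a dial-up modem connection

photo_time = transfer_seconds(PHOTO_BYTES, MODEM_BPS)   # several minutes
text_time = transfer_seconds(TEXT_BYTES, MODEM_BPS)     # under one second
photo_bps_for_1s = required_bps(PHOTO_BYTES, 1.0)
text_bps_for_1s = required_bps(TEXT_BYTES, 1.0)
```

Under these assumptions, downloading the photograph in one second would call for 500 times the bandwidth needed for the page of text, since the required bandwidth scales linearly with file size.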
The bandwidth of a network connection, being a finite value, can dramatically affect transmission speeds in the situation where multiple users, such as multiple users on workstations 12, are accessing the Web during the same time frame. While bandwidth is less of an issue when multiple users are accessing the Web intermittently, resulting in sporadic transmissions of data, problems may arise in the event that one or more users request large amounts of data to be sent in a steady and continuous stream. Such “streaming” may result, for example, when particularly large files are downloaded, such as graphics files, software programs, large textual documents, electronic newspapers, etc., and in the case when one or more users are retrieving large multi-media files. Furthermore, many businesses, offices, and individuals share bandwidth with other non-associated Web users (i.e., have non-dedicated Internet connections), further reducing the bandwidth available for data transmission. In some settings where dependence on the Web is high, the slow data transmission speeds that may occur as a result of network congestion might critically affect productivity, or even lead to missed business opportunities. In a business setting using a less than optimum bandwidth, the problem of network congestion is exacerbated by the fact that the majority of Web users will be connected via the same bandwidth “pipeline” during peak business hours (9 am–5 pm).
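The effect of sharing a finite “pipeline” can be approximated with the following Python sketch. It assumes, purely for illustration, that the link bandwidth is divided equally among simultaneously active users, which real networks do not guarantee; the 1,500,000 bps link speed and the function names are likewise hypothetical. The peak window of 9 am to 5 pm is taken from the paragraph above.

```python
from datetime import time

PEAK_START = time(9, 0)    # 9 am, start of peak business hours
PEAK_END = time(17, 0)     # 5 pm, end of peak business hours


def is_peak(t: time) -> bool:
    """Return True if t falls within the peak business-hours window."""
    return PEAK_START <= t < PEAK_END


def per_user_bps(link_bps: float, active_users: int) -> float:
    """Equal-share approximation of the bandwidth each active user sees."""
    if active_users < 1:
        raise ValueError("at least one active user is required")
    return link_bps / active_users


LINK_BPS = 1_500_000                           # assumed shared office link
share_at_noon = per_user_bps(LINK_BPS, 10)     # ten users during peak hours
share_at_night = per_user_bps(LINK_BPS, 1)     # a single off-peak download
```

Under this simplification, a download started outside the peak window with no competing users would see ten times the bandwidth available to each of ten simultaneous daytime users.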
Even assuming a bandwidth sufficient for multiple simultaneous use and large downloading operations, the serial interaction between a conventional Web browser and a user results in a time lag that is fairly inefficient and wasteful of both processing and worker productivity time. Specifically, the user takes time and effort to activate the Web connection and input the desired URL. The Web browser then processes the user input and sends out the proper data request. If the requested file is large, the user must typically wait some time until the information is transmitted, routed, received by a Web server, processed by the Web server, transmitted by the Web server, assembled by the browser, and then displayed on the user's workstation monitor.
U.S. Pat. No. 6,134,584 to Chang et al. (“Chang”) and U.S. Pat. No. 6,067,565 to Horvitz (“Horvitz”) disclose downloading techniques for dealing with network congestion. Chang describes a method in which an “Internet Data Download Scheduler” (“Scheduler”) is provided on a user's machine to download web-based documents at a specified scheduled time. The Scheduler of Chang is further disclosed to enable a user to download under prescribed bandwidth conditions, and to allow the machine to be activated from an “off” status in order to initiate the download process. Chang, however, provides little detail as to how the scheduling methods are to be accomplished.
Horvitz discloses techniques of “prefetching” web pages during periods of low processing and low network activity, and then storing those pages in the cache of a user's machine for expedited access. The “prefetched” pages are those computed to be likely to be accessed in the future based on analysis of a user's Internet usage. The methods of Horvitz, however, are based on a probabilistic user model, and do not allow a user to specify or schedule a particular desired download.
If the user desires to print a hardcopy of the information, the process is accordingly more complex and takes additional time. To print, a user at workstation 12 uses driver software to load documents or images into a buffer (usually an area on a disk of a workstation), where a printer 18 pulls them off the buffer at its own rate (See FIG. 1). The user may also use the driver software to manually adjust the print attributes, such as the image options (e.g., resolution, background), image orientation (landscape, portrait, etc.), colors, number of copies, duplex (double-sided) or single-sided printing, or other printing parameters.
A network printer 18, however, can print only one job at a time, yet must be available at all times to multiple users. For this reason, local network architectures 10 will typically incorporate a dedicated print server, also known as a print spooler 20. Print spooler 20 is a device that accepts requests for printer resources, and then allocates the use of printer resources according to a set of specified rules. When a print spooler 20 is present, users can send their data through the print spooler 20 rather than directly to the printer 18.
Upon receiving data destined for one or more printers 18, print spooler 20 writes this data into a temporary file instead of sending it immediately to a printer. For example, when a single user or multiple users initiate commands on workstation(s) 12 to print a number of documents, print spooler 20 queues the documents by placing them in an interim holding area called a print buffer or print queue. The printer then pulls the documents off the queue one at a time. Later, when the printer becomes available, the print spooler 20 will write the data to the printer.
The order in which a print spooler 20 executes jobs on a queue depends on the priority system being used. Most commonly, jobs are executed in the same order that they were received on the queue (i.e., on a first in first out basis), but certain jobs can be given higher priority dependent upon the particular system scheme. Typically, a particular print job will remain on the queue until printed, at which time incoming print jobs may be allowed to overwrite the print request. Spooling thus lets multiple users place a number of print jobs on a queue instead of waiting for each one to finish before specifying the next one. The operating systems of individual workstations are also often configured with print spoolers specific for the particular workstation.
For large Web-based documents that need to be downloaded and printed, and particularly for documents needed on a recurring basis, the combination of user interaction time, download time, possible network congestion, and printer queuing operations may cause network and user resources to be squandered during times normally associated with peak productivity. Therefore, a need exists in the art for convenient apparatus and accompanying methods for automatically downloading, storing, and printing documents and images from the Web, particularly during intervals of low or idle network, microprocessor, and bandwidth activity. Such apparatus and methods would advantageously free up network and user resources, thus increasing network and worker productivity.
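The queuing behavior described above, first in first out with optional priority, can be sketched in Python using a heap-ordered queue. This is a simplified model and not the implementation of any particular print spooler: the class and method names, the job names, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class PrintJob:
    priority: int                     # lower number prints sooner (assumed)
    sequence: int                     # arrival order, breaks priority ties
    name: str = field(compare=False)  # job name is not used for ordering


class PrintSpooler:
    """Hypothetical spooler: FIFO within a priority level."""

    def __init__(self) -> None:
        self._queue: List[PrintJob] = []
        self._counter = itertools.count()

    def submit(self, name: str, priority: int = 5) -> None:
        """Place a job on the queue without waiting for the printer."""
        job = PrintJob(priority, next(self._counter), name)
        heapq.heappush(self._queue, job)

    def next_job(self) -> Optional[str]:
        """Hand the printer its next job, or None if the queue is empty."""
        if not self._queue:
            return None
        return heapq.heappop(self._queue).name


spooler = PrintSpooler()
spooler.submit("report.txt")
spooler.submit("chart.gif")
spooler.submit("urgent-memo.txt", priority=1)   # jumps ahead of the others
order = [spooler.next_job() for _ in range(3)]
```

Jobs submitted at the default priority leave the queue in arrival order, while the higher-priority job is printed first even though it arrived last.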