Field of the Invention
The invention relates generally to the field of networking and in particular to the field of using auxiliary storage systems such as disk drives as caches for performance improvements in networks.
As more users and more websites are added to the World Wide Web on the Internet, the content of the information transmitted on it also increases in complexity and quantity: motion video, more complex graphics, audio transmissions, and so on, place rapidly increasing performance demands on the Internet at all points. The problem faced by service and content providers as well as users is how to maintain or improve performance for a growing user base without constantly creating the need for additional capacity or "bandwidth" in the network.
Websites and web browser software on the World Wide Web (WWW), such as that provided by Netscape Communications Corporation (having a principal place of business in Mountain View, Calif.), use storage systems such as magnetic disks to store data being sent and received, and most of these also use a simple form of disk caching at the website or at the user site to improve performance and minimize re-transmissions of the same data. These typically use a "least recently used" (LRU) algorithm to maintain the most recently referenced data in the disk cache and a protocol that permits a user to request that a page be refreshed even if it is in the cache. However, as traffic continues to grow, this method needs to be improved upon to provide the performance that may be required.
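The LRU scheme with user-requested refresh described above can be illustrated with a minimal sketch. The class name `LRUPageCache`, the `fetch` callback (standing in for a network transmission), and the `refresh` flag are assumptions made for illustration only; they are not taken from any particular browser or from the invention.

```python
from collections import OrderedDict


class LRUPageCache:
    """Illustrative LRU page cache keyed by URL (hypothetical names throughout)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity    # maximum number of cached pages
        self.fetch = fetch          # callable standing in for a network request
        self.pages = OrderedDict()  # oldest entry first, newest entry last
        self.fetches = 0            # number of (re)transmissions performed

    def get(self, url, refresh=False):
        # Serve from the cache unless the user explicitly asks for a refresh.
        if url in self.pages and not refresh:
            self.pages.move_to_end(url)  # mark as most recently used
            return self.pages[url]
        body = self.fetch(url)
        self.fetches += 1
        self.pages[url] = body
        self.pages.move_to_end(url)
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return body


cache = LRUPageCache(capacity=2, fetch=lambda url: f"<page {url}>")
for url in ["a", "b", "a", "c"]:  # the second "a" is a hit; "c" evicts "b"
    cache.get(url)
cache.get("a", refresh=True)      # forced refresh causes a re-transmission
```

In this sketch, a page that has been evicted must be transmitted again on its next request, which is the source of the repeated traffic discussed in the following paragraph.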
Traffic increases as subsequent requests are made for web pages that had been sent earlier, but are no longer in the local user's system. The same re-transmission will occur at other points in the network, thus degrading overall response time and requiring additional network bandwidth. One approach that is frequently used to tackle the problem is the use of faster transmission media to increase bandwidth. This takes large capital and labor expense to install and may also require replacement of modems and other equipment at various nodes. Service providers that install faster transmission equipment must still match the speeds at which their users can send and receive data, so bottlenecks can still occur that slow down performance and response times at the user's site.
Users who upgrade to faster transmission media may often have to scrap modems and other units that were limited to slower speeds. Somewhat less frequently, large-scale internal network wiring changes may need to be made as well, often causing disruptions to service when problems are found during and after installation. With any of these changes, software changes may also be required at the user's site to support the new hardware.
Despite the users' best efforts, a well-known phenomenon in network systems design, called the "turnpike" effect, may continually occur as users upgrade to faster transmission media. As United States interstate highway builders first observed in the 1950s, when better, "faster" highways were made available, more people tended to use them than were initially anticipated. A highway might have been designed to handle a specific amount of traffic, based on then-present patterns and data. But once people learned how much faster and smoother travel on the new highway was, traffic might increase to two or three times the original projections, making the highway nearly obsolete almost at the outset of its planned life.
Similar problems occur with users of the Internet and service and content providers. Many of the service providers and online system services have had difficulty adding systems and transmission links to keep up with such increases in traffic. As technology improves in all areas, content providers are providing more graphics, videos and interactive features that impose major new loads on the existing transmission systems. As companies and institutions install or expand local and wide area networks for their internal use, they are also linking them to Internet providers and sites, usually through gateways with "firewalls" to prevent unauthorized access to their internal networks. As these companies link their internal networks to the Internet and other external networks, usage and traffic on the Internet increases multi-fold. Many of these same companies and institutions are also content providers, offering websites of their own to others.
The content providers add to the problem of increased traffic in yet another way, when time-sensitive data is stored and transmitted. Stock quotes, for example, during the hours when a given exchange is open, are highly time sensitive. Web pages containing them or other market information need to be updated frequently during trading hours. Users who are tracking such quotes often want to ensure that they have the latest update of the web page. If standard Least Recently Used (LRU) caching algorithms are used at the user site and this web page is in constant use, the cached copies may not be refreshed for several cycles of stock price changes. Here, caching data works to the user's disadvantage.
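The staleness problem described above arises because an LRU cache never evicts a page that is in constant use. One way to picture an alternative is a cache that checks the age of each entry before serving it. The following is a minimal sketch under stated assumptions: the class name `TimeCurrentCache`, the `max_age_seconds` parameter, and the injected `clock` are hypothetical and are used only to make the behavior deterministic and visible.

```python
class TimeCurrentCache:
    """Illustrative cache that refetches entries older than a maximum age."""

    def __init__(self, fetch, max_age_seconds, clock):
        self.fetch = fetch              # callable standing in for a network request
        self.max_age = max_age_seconds  # assumed freshness limit for cached pages
        self.clock = clock              # callable returning the current time in seconds
        self.entries = {}               # url -> (time stored, page body)

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now - entry[0] <= self.max_age:
            return entry[1]             # still current: serve the cached copy
        body = self.fetch(url)          # missing or too old: fetch a fresh copy
        self.entries[url] = (now, body)
        return body


now = [0.0]  # simulated clock, advanced by hand below
cache = TimeCurrentCache(
    fetch=lambda url: f"quote as of t={now[0]:.0f}s",
    max_age_seconds=15,
    clock=lambda: now[0],
)
first = cache.get("quotes")   # fetched at t=0
now[0] = 10.0
second = cache.get("quotes")  # 10 s old: cached copy is served
now[0] = 20.0
third = cache.get("quotes")   # 20 s old: exceeds the limit, so it is refetched
```

With a pure LRU policy, all three requests after the first would be served from the cache for as long as the page stayed in use, regardless of how many price changes had occurred.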
However, once that exchange closes, there should be no updates until the following business day. For the high-volume, high-visibility exchanges, this means traffic can reach peaks of congestion during trading hours. The network capacity used to keep up with this may lie dormant during off-peak hours. Most existing service and content providers on the Internet do not, at present, have an effective way to differentiate between these service levels in their prices or service offerings.
Private dial-up services, such as WESTLAW® of West Licensing Corporation or LEXIS/NEXIS® of Reed Elsevier or COMPUSERVE® of CompuServe, Incorporated or AMERICA ONLINE® (AOL®) of America Online, Incorporated, have been able to offer differentiated pricing for networked access to certain kinds of data in their proprietary databases, but doing this is greatly simplified when the choices are limited and relatively few in number. In most cases this is done on the basis of connect time and perhaps some additional fee per database accessed.
Data management methods, such as least recently used caching, can be applied to proprietary databases as well. Usually only one form of data or cache management is associated with a database, and the choice of a particular method of data and cache management has historically been based on the type of file being created.
On the Internet, by contrast, data requests can come from anywhere in the world for almost any topic in the world, to any content provider in the world. Patterns of access and timeliness requirements vary greatly from user to user. An educational institution that provides Internet services to its students and faculty will have one set of needs for access and response times, while a business corporation user may have a completely different set of needs.
Access to data on the Internet also differs from dial-up access to proprietary databases in another way. The private dial-up service provider may not change the services offered for months or even years at a time. Data files may be updated, but the kinds of information that can be obtained may remain constant.
On the Internet, the opposite is true. Information that was not available three months ago anywhere in the world may now be available from several different sources. This is also true for the format of the information. In less than a three-year time span, web pages have gone from text only, to text plus drawings, then to text plus high-resolution photographic-like images in several different formats. Sound is also available now from many sites. Web browsers now permit use of videos and interactive forms. Traditional network and data management techniques are hard-pressed to keep up with these changes.
It is an object of the present invention to provide a method and apparatus for improving network response time at one or more sites or nodes while reducing the amount of bandwidth used to carry a given load.
Another object of the present invention is providing improvements in network response time without requiring any changes in transmission media and transmission equipment.
Still another object of the present invention is providing a flexible method and apparatus for providing response time improvements that can readily be adjusted to different usage patterns.
A further object of the present invention is providing a method and apparatus that permits a service or content provider to offer differentiated levels of service and prices based on the type of data being transmitted.
These and other objects are achieved by a network accelerator storage caching system that may be inserted at any point in a network, to provide a configurable, scalable variety of cache management systems to improve response time. Depending on the configuration(s) selected, the system may manage data or subsets of data in a storage cache on the basis of time-currency, page usage frequency, charging considerations, pre-fetching algorithms, data-usage patterns, store-through methods for updated pages, least recently used method, B-tree algorithms, or indexing techniques including named element ordering, among others. A preferred embodiment may embed the configurable cache management in the storage media, either as firmware in a storage controller or as software executing in a central processing unit (CPU) in a storage controller. In a preferred embodiment the system may be scaled in size and offer security for protected data.
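The idea of selecting a cache management method per subset of data can be pictured with a short sketch. This is only an illustration under stated assumptions, not the claimed implementation: the class `ConfigurableCache`, the URL-prefix rule table, and the two example policies (`time_currency` and `keep_until_evicted`) are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Entry:
    body: str
    stored_at: float


# A policy answers: may this cached entry still be served at time `now`?
Policy = Callable[[Entry, float], bool]


def time_currency(max_age_seconds: float) -> Policy:
    """Policy that serves an entry only while it is younger than the limit."""
    return lambda entry, now: now - entry.stored_at <= max_age_seconds


def keep_until_evicted(entry: Entry, now: float) -> bool:
    """Policy that always serves the cached copy (no freshness check)."""
    return True


class ConfigurableCache:
    """Illustrative cache that picks a policy by URL prefix (hypothetical)."""

    def __init__(self, rules, default, fetch, clock):
        self.rules = rules      # list of (url_prefix, policy) pairs, first match wins
        self.default = default  # policy used when no prefix matches
        self.fetch = fetch      # callable standing in for a network request
        self.clock = clock      # callable returning the current time in seconds
        self.entries: Dict[str, Entry] = {}
        self.fetched: List[str] = []  # log of URLs actually transmitted

    def policy_for(self, url):
        for prefix, policy in self.rules:
            if url.startswith(prefix):
                return policy
        return self.default

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and self.policy_for(url)(entry, now):
            return entry.body
        body = self.fetch(url)
        self.fetched.append(url)
        self.entries[url] = Entry(body, now)
        return body
```

A site could, for example, configure `rules=[("/quotes/", time_currency(15))]` with `default=keep_until_evicted`, so that quote pages are refetched once they are more than 15 seconds old while other pages are served from storage.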
It is an aspect of the present invention to provide improvements in response times.
It is another aspect of the present invention to reduce the bandwidth required in the vicinity of the invention to transmit information responsively.
Another aspect of the present invention is to enable configuring at each site to use the cache method(s) preferred by that site.
A further aspect of the present invention is allowing a site to trade storage space for transmission capacity or bandwidth.