With the rapid development of the Internet, more and more people obtain rich information resources through the Internet. However, the sharp increase in the number of users inevitably causes problems such as heavy load on the network server, longer latency on the client, and congestion of the backbone network. These problems become more serious as streaming applications such as video on demand (VOD), distance education, and e-commerce grow in popularity.
The conventional method for solving these problems is to upgrade the network server and increase the network access bandwidth continuously. However, this method cannot solve the problems fundamentally, for two reasons: server upgrades generally cannot keep pace with the growth in the number of users; and merely increasing the network access bandwidth does not relieve congestion of the backbone network, while upgrading the backbone network itself is costly.
The proxy cache technology can effectively solve the above problems, which the conventional method leaves unsolved. The proxy cache, also known as a proxy server, is a server located between the client (for example, a browser) and the remote server, and can provide a large storage space. When a user requests data, the proxy cache first checks whether the requested data is available locally. If the data is available in the proxy cache, the data is sent directly to the user; otherwise, the data is obtained from the remote server and sent to the user.
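The lookup flow described above can be sketched in a few lines of Python. This is a minimal illustration, not an actual proxy implementation: the class name `ProxyCache`, the `fetch_from_origin` callback, and the decision to store every fetched object are assumptions made for the example.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy proxy cache: serve from local storage on a hit, else ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical stand-in for a request to the remote server.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Hit: the data is sent directly from the proxy cache.
            self.hits += 1
            return self._store[url]
        # Miss: obtain the data from the remote server, keep a copy (assumed), return it.
        self.misses += 1
        data = self._fetch(url)
        self._store[url] = data
        return data
```

In this sketch a second request for the same URL never reaches the origin, which is the effect the proxy cache relies on to reduce server load and client latency.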
The access mechanism of the proxy cache reduces the number of times the user must access the remote server, which directly improves the response speed seen by the client and indirectly eases the load on the remote server and the congestion of the backbone network. In addition, because the proxy cache stores a copy of the data, the user can still obtain information resources from the proxy cache even if the remote server cannot provide services for a certain period of time. This clearly improves the quality of service experienced by the client.
However, the storage capacity of the proxy cache is limited. Once the storage area is full, some obsolete data must be replaced according to a pre-agreed policy so that the cache can continue to serve subsequent requests. The cache replacement policy is therefore an important factor that directly affects the performance of the proxy cache.
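To make the role of the replacement policy concrete, the following sketch shows a size-bounded cache that evicts the least frequently used entry when the storage area is full. It is an illustrative toy under stated assumptions: the class `LFUCache`, its byte-based capacity, and the tie-breaking rule (the oldest entry among equally used ones goes first) are choices made for the example, not details taken from any particular proxy.

```python
from typing import Dict, Optional


class LFUCache:
    """Byte-bounded cache that evicts the least frequently used entry when full."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._data: Dict[str, bytes] = {}
        self._freq: Dict[str, int] = {}

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._data:
            return None
        self._freq[key] += 1  # every hit raises the entry's use count
        return self._data[key]

    def put(self, key: str, value: bytes) -> None:
        if len(value) > self.capacity_bytes:
            return  # the object can never fit, so it is not cached
        if key in self._data:
            # Replacing an existing entry: release its space, keep its use count.
            self.used_bytes -= len(self._data.pop(key))
        # Storage full: replace entries, least frequently used first.
        while self.used_bytes + len(value) > self.capacity_bytes:
            victim = min(self._data, key=lambda k: self._freq[k])
            self.used_bytes -= len(self._data.pop(victim))
            del self._freq[victim]
        self._data[key] = value
        self._freq[key] = self._freq.get(key, 0) + 1
        self.used_bytes += len(value)
```

Swapping the `min(...)` selection rule for a different one is all it takes to change the policy in this toy, which is why the choice of rule has such a direct effect on cache performance.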
Generally, the performance indicators for measuring a cache replacement policy include a cache hit rate (CHR) (also called hit rate), a byte hit rate (BHR), and a latency time (LT). The CHR is a ratio of the count of hit cache pages to the total number of user requests. The BHR is a ratio of the number of hit bytes in the cache to the total number of bytes requested by the user. The LT refers to the interval from the time when the user initiates an access request to the time when the user receives a response to the request.
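The three indicators can be computed from a simple request log, as in the sketch below. The class `CacheStats` and its field names are hypothetical, and reporting LT as a mean over all requests is an assumption of this example; the definition above only specifies the latency of a single request.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Accumulates per-request observations and derives CHR, BHR, and mean LT."""

    requests: int = 0
    hits: int = 0
    bytes_requested: int = 0
    bytes_hit: int = 0
    total_latency_s: float = 0.0

    def record(self, size_bytes: int, hit: bool, latency_s: float) -> None:
        self.requests += 1
        self.bytes_requested += size_bytes
        self.total_latency_s += latency_s
        if hit:
            self.hits += 1
            self.bytes_hit += size_bytes

    @property
    def cache_hit_rate(self) -> float:
        # CHR: hits divided by the total number of user requests.
        return self.hits / self.requests if self.requests else 0.0

    @property
    def byte_hit_rate(self) -> float:
        # BHR: bytes served from the cache divided by all bytes requested.
        return self.bytes_hit / self.bytes_requested if self.bytes_requested else 0.0

    @property
    def mean_latency_s(self) -> float:
        # Mean LT: average interval between request and response (assumed aggregation).
        return self.total_latency_s / self.requests if self.requests else 0.0
```

For instance, one 100-byte hit and one 300-byte miss give a CHR of 0.5 but a BHR of only 0.25, which shows why the two indicators can rank the same policy differently.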
Currently, commonly used cache replacement policies include the Least Frequently Used (LFU) algorithm, the Greedy Dual Size (GDS) algorithm, the Least Frequently Recently Used (LFRU) algorithm, the Period Least Frequently Used (PLFU) algorithm, and the Least Served Bytes (LSB) algorithm. The performance of a cache replacement policy depends on multiple factors, for example, the user access model, the application features of the user access, and the cache size. No cache replacement policy is currently available that covers all these factors and is superior to the other algorithms in every case. In addition, these algorithms do not adapt when the application changes. For example, for conventional Web access, the GDS algorithm performs better; for the VOD application, the LFRU algorithm and the PLFU algorithm are more efficient than the conventional LFU algorithm and are more applicable to a large-scale VOD system; and the LSB algorithm is more suitable for the peer-to-peer (P2P) download application.
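As an illustration of how such policies differ from plain frequency counting, the following is a minimal sketch of the GDS rule as it is commonly described: each cached object carries a value H = L + cost/size, the object with the smallest H is replaced first, and L is raised to the H of each replaced object. The class name, the uniform default cost of 1, and the byte-based capacity are assumptions for the example, not details given in the text above.

```python
from typing import Dict


class GreedyDualSize:
    """Toy Greedy Dual Size cache tracking only object sizes and H values."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.inflation = 0.0  # the running value usually written as L
        self._size: Dict[str, int] = {}
        self._h: Dict[str, float] = {}

    def access(self, key: str, size: int, cost: float = 1.0) -> bool:
        """Return True on a hit; on a miss, try to admit the object and return False."""
        if key in self._h:
            # Hit: restore the object's value relative to the current L.
            self._h[key] = self.inflation + cost / self._size[key]
            return True
        if size <= 0 or size > self.capacity_bytes:
            return False  # cannot be cached at all
        while self.used_bytes + size > self.capacity_bytes:
            # Replace the object with the smallest H and raise L to that value.
            victim = min(self._h, key=self._h.__getitem__)
            self.inflation = self._h.pop(victim)
            self.used_bytes -= self._size.pop(victim)
        self._size[key] = size
        self._h[key] = self.inflation + cost / size
        self.used_bytes += size
        return False
```

With a uniform cost, large objects receive small H values and are replaced first, which favors keeping many small Web objects; this is one plausible reason the rule suits conventional Web access better than large-object workloads such as VOD.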
As network applications change constantly, new requirements are raised for the adaptability of the cache replacement policy. The Squid Cache is taken as an example to describe a typical solution in the prior art. The Squid Cache (briefly referred to as the Squid) is a popular open source proxy server and Web cache server whose behavior is controlled through a configuration file. The following excerpt from a Squid configuration file includes the configuration of the cache replacement policy.
......
cache_mem 1228 MB
maximum_object_size 5096 KB
maximum_object_size_in_memory 5096 KB
cache_replacement_policy heap LFUDA
memory_replacement_policy heap LRU
cache_dir aufs /cache 27970 16 256
cache_access_log /log/access.log
cache_log /log/cache.log
......
The configuration of cache replacement policy is executed by using the following two lines:
cache_replacement_policy heap LFUDA
memory_replacement_policy heap LRU
In the above two lines, the first line configures the cache replacement policy of the entire Squid Cache as the LFUDA algorithm, and the second line configures the cache replacement policy of the memory as the LRU algorithm.
That is, the Squid determines its cache replacement policy through pre-configuration. Once the Squid Cache is running, the cache replacement policy cannot be changed; if the configured policy is modified, the Squid Cache must be restarted for the new configuration to take effect. The Squid Cache therefore has the following disadvantages: The selection and update of the cache replacement policy require manual intervention. The cache replacement policy is determined, selected, and updated by an administrator. That is, to select a specific algorithm, the administrator must clearly know various factors such as the current user access model and application, and must also know which cache replacement policy is suitable for the current scenario. This places excessive demands on the administrator. When the scenario changes, for example, when the user access model or the application changes, the Squid Cache cannot adapt to the change properly.