This invention relates to retrieving information for client processes and, in particular, to using cache memory to service multiple concurrent requests while bounding the delay in servicing any one client process.
The Internet has experienced phenomenal growth in recent years, in part due to the explosion of e-commerce and its increased use for educational and entertainment purposes. More and more businesses and consumers are relying on the Internet for information. Unfortunately, the capacity enhancements of the Internet backbone have not kept up with the Internet's increased usage and geographical span. As a result, users experience network delays in accessing websites.
To address this problem, many servers, e.g., proxy servers, have begun to cache documents retrieved from web servers to speed up subsequent requests for the same document by their clients. These servers make a copy of each requested document received from a web server and service all their local client requests by sending this copy. A problem occurs when many clients concurrently request the same document from a web server. How to satisfy these requests without unduly burdening any one client is a nontrivial problem. The server servicing the local clients is further hampered by not knowing the size of the document being requested. Storing a document in the cache while serving it to multiple concurrent requests is difficult. Furthermore, placing bounds on the delay any one client may experience while reducing the processing overhead complicates the matter even more.
Systems and methods have been established in an attempt to address these problems. FIG. 1 depicts a system of downloading a document from a server process and sending it to browser clients. The browser clients 115 and 125 request information from the web server 105. The cache entry 110 is located in a memory so that it can be quickly accessed for a browser client. For example, it may be located in the cache of a proxy server that is servicing multiple local clients, including the browser client 115. Only two browser clients and one server are shown; however, one skilled in the art will recognize that many browser clients may desire the same information from multiple web servers 105.
Thread 1 120 is created by the thread processing component 103 in response to browser client 1 115 requesting information, or a document, from the web server 105. A thread contains the instructions necessary to retrieve the information from the web server for the client with which that thread is associated. The thread processing component 103 is contained in an interface between browser clients 115, 125 and a web server 105 and has access to a cache entry 110. For example, a proxy server that interfaces with a browser client and a web server may contain the thread processing component, creating a thread in response to a client's request.
Continuing with FIG. 1, Thread 1 120 functions as a producer and consumer thread. As a producer, Thread 1 retrieves the client-requested information from the web server 105. When Thread 1 120 receives the requested information from the web server 105, it also places this information in a cache entry 110. As a consumer, it places the information supplied by the web server in the input stream for its associated client, i.e., browser client 1 115. Also, when operating as a consumer thread, a thread may retrieve the requested information for a client directly from the web server, or from a secondary source, such as a cache.
Browser client 2 125 also requests the same information as requested by browser client 1 115. However, because a thread acting as producer and consumer was already created for browser client 1 115, Thread 2 130, created in response to the request of browser client 2 125, will function only as a consumer. Thread 2 130 will retrieve the requested information from the cache entry 110 after it has been placed there by Thread 1 120.
In this approach, however, all the information requested, e.g., an entire document, is placed in the cache entry 110 before either of the threads places, or transmits, any information to the output stream for its respective client. While this solution is simple and easy to manage, clients do not receive any data until all the information has been downloaded from the web server. Consequently, there is a large initial delay before any client receives any of the requested information. This is a problem because many browser clients and/or users may not be aware that the system is working and may terminate a download, not knowing that the system is actually processing the request.
FIG. 2 depicts a system of downloading a document in segments, or data blocks, for client processes. Similar to the FIG. 1 approach, a producer/consumer thread 205 ("Thread 1") is created when browser client 1 115 makes a request for a document contained at the web server 105. In addition, a consumer thread 230 ("Thread 2") is created for browser client 2 125, which requests the same information that browser client 1 requested. However, in this instance, Thread 1 205 downloads and stores data retrieved from the web server as smaller data blocks 218 in the cache entry 215 instead of as one complete document. As Thread 1 retrieves each smaller data block, the block is placed in a data block 218 in the cache entry 215 and is sent to browser client 1 115 via the client 1 output stream. In addition, once the data block is placed in the cache entry 215, Thread 2 retrieves the data block from the cache entry 215 and sends it to its client 125 via the client 2 output stream.
This approach solves the problem of the large initial delay, characteristic of the previous approach, that a browser client suffers before receiving any of the data from the web server. In this case, as soon as a block of data is received from the web server, it is placed in the output stream, so the browser client receives an indication, i.e., data from the web server 105, that the request is being processed without having to wait for the entire document to be downloaded.
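By way of non-limiting illustration, the block-wise handling of FIG. 2 may be sketched in Python as follows. The function name `stream_in_blocks`, the block size, and the in-memory byte streams standing in for the web server and the client output stream are assumptions made for this sketch and do not appear in the described system.

```python
import io

def stream_in_blocks(source, cache_blocks, client_out, block_size=4):
    """Read `source` block by block; cache each block and forward it at once."""
    while True:
        block = source.read(block_size)
        if not block:                 # an empty read means the source is exhausted
            break
        cache_blocks.append(block)    # the cache entry grows one data block at a time
        client_out.write(block)       # the client sees data before the download ends
    return len(cache_blocks)

# Hypothetical stand-ins for the web server and the client 1 output stream.
source = io.BytesIO(b"hello, cached world")
cache_blocks = []
client_out = io.BytesIO()
count = stream_in_blocks(source, cache_blocks, client_out)
```

Because the next read is not issued until the previous block has been written to the client, the rate at which the cache fills in this sketch is tied to the rate at which that one client accepts data.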
However, the problem with this approach is that the browser client which makes the request dictates the speed at which the cache entry 215 is filled. In FIG. 2, Thread 1 205 controls the speed at which information is retrieved from the web server 105. Thread 1 205 will not request another data block from the web server until browser client 1 115 (associated with Thread 1) receives the previously retrieved data block. Consequently, if browser client 2 125 can retrieve the data block from the cache faster than browser client 1 115 can retrieve the data block sent from Thread 1 205, browser client 2 125 will incur idle time waiting for browser client 1 115 to finish receiving the previous data block. Again, this is because Thread 1 205 is in control of downloading information from the web server 105 and Thread 2 230 acts only as a consumer to send the information to its client.
To address this problem of having one consumer thread, and consequently one browser client, dictate the speed of the download for other browsers, another approach was developed which creates one thread to control the downloading from the web server to the cache and another thread to control the downloading from the cache to the browser client. FIG. 3 depicts a system that uses separate threads as producer and consumer.
This approach isolates the activity of a producer thread, which receives the information from the web server, from that of a consumer thread, which delivers the information from the cache to the browser client. In this solution, an independent producer thread 310 is created that is responsible for filling the cache entry 215 with data blocks of information 218.
When browser client 1 115 creates a request for information from the web server 105, a consumer thread 305 and a producer thread 310 are created. The responsibility of the producer thread 310 is to fill the data blocks 218 of the cache entry 215 with the information from the web server 105 requested by browser client 1 115. The consumer thread 305 then places information from the cache entry data blocks 218 into the output stream for browser client 1 115. Consumer thread 330 acts similarly to consumer thread 305 to send the requested information to browser client 2 125.
This solution solves the delay problem experienced in FIG. 2, wherein one browser client dictates the speed at which other clients retrieve data for concurrent requests, by having a producer dedicated to retrieving data from the web server. Generally, though, there is only one request for given data from a web server at a time, i.e., only one browser client requesting the same information. Therefore, this approach creates multiple threads for each request even though in the vast majority of cases only one client wants the information. Creating multiple threads for each request increases thread management and overhead for the system. In addition, when a browser client's request is canceled prior to receiving the entire document, e.g., because the client unexpectedly goes offline, the producer may continue to download data even though there is no client to receive it, thereby wasting network bandwidth.
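For illustration only, the dedicated-producer arrangement of FIG. 3 may be sketched in Python as follows. The class `CacheEntry`, the functions `producer` and `consumer`, and the use of a condition variable to wake waiting consumers are assumptions of this sketch; the list of byte strings stands in for data arriving from the web server.

```python
import threading

class CacheEntry:
    """Hypothetical cache entry: an append-only list of data blocks."""
    def __init__(self):
        self.blocks = []
        self.complete = False
        self.cond = threading.Condition()

def producer(entry, source_blocks):
    # Dedicated producer: fills the cache regardless of consumer speed.
    for block in source_blocks:
        with entry.cond:
            entry.blocks.append(block)
            entry.cond.notify_all()
    with entry.cond:
        entry.complete = True
        entry.cond.notify_all()

def consumer(entry, out):
    # Consumer: copies blocks from the cache to its client's output.
    index = 0
    while True:
        with entry.cond:
            while index >= len(entry.blocks) and not entry.complete:
                entry.cond.wait()
            if index >= len(entry.blocks):
                return                     # download finished, nothing left to send
            block = entry.blocks[index]
        out.append(block)
        index += 1

entry = CacheEntry()
outs = [[], []]                            # output streams for two clients
threads = [threading.Thread(target=consumer, args=(entry, o)) for o in outs]
threads.append(threading.Thread(target=producer, args=(entry, [b"a", b"b", b"c"])))
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that in this sketch even a single client would require two threads, and the producer keeps appending blocks whether or not any consumer remains, which mirrors the overhead and wasted-bandwidth drawbacks noted above.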
Therefore, what is needed in the art is an improved method to retrieve information from a server for multiple clients.
Systems and methods consistent with the present invention satisfy the above need by presenting a method and system to store a document in a cache while allowing it to be served to multiple concurrent client processes. Threads are created that can function as both a producer and a consumer to retrieve information from a server and send it to multiple clients. While a thread is retrieving information for its client, it functions as a consumer. However, when its client needs additional information that is not contained in the cache, the thread takes on the role of a producer to retrieve data from the server. Any thread has the capability to assume the role of a producer at any given time.
Desired characteristics of systems and methods consistent with the present invention include the following: once the data is obtained from the source, it should be placed locally in memory; subsequent read requests for the resource should be served from the cache; multiple concurrent requests for the same resource should not result in downloading the data from the source more than once; and finally, a read request should not suffer a high initial delay, such as that experienced with the approach discussed in relation to FIG. 1. It is desirable to keep the initial delay constant and hence independent of the total length of the resource.
Systems and methods consistent with the present invention have many advantages. First, they minimize thread usage, i.e., no additional threads are created to fill the cache. Specifically, the consumer thread does the job of the producer as well, thus halving the number of threads created in the common case compared with the approach discussed in relation to FIG. 3.
Systems and methods consistent with the present invention also reduce response delay. The user receives intermittent feedback during the download, which reassures the user that the system is working on downloading the requested document and prevents the user from disconnecting in the belief that the end server is not responding.
Systems and methods consistent with the present invention also have the advantage of reducing synchronization overhead and performing a lazy data fetch. With respect to synchronization overhead, once a buffer in the cache is marked complete, a consumer never needs to synchronize to read it. With a lazy data fetch, data is fetched only on demand.
Systems and methods consistent with the present invention also provide the advantage of minimizing data copying. There is only one data copy from the source to the cache. Subsequently, every consumer does one data copy to its output streams. Therefore, the minimal amount of data copying is achieved.
Yet another advantage is the avoidance of busy waits. Busy waiting happens when a thread is contending to perform an activity on a shared resource and does not want to block itself. In this case, the activity is to fill the buffer with data. Busy waiting is prevented by synchronizing the calls to fetch data from the web server. A consumer is therefore in one of three states: reading data from the buffer, filling the buffer by fetching data from the source, or blocked waiting to fetch data from the source.
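As a minimal sketch under stated assumptions, the three consumer states described above may be illustrated in Python as follows. The class `SharedEntry`, its method `get_block`, the counter `fetch_count`, and the iterator standing in for the web server are hypothetical names introduced for this sketch. The unsynchronized fast path assumes that the block list is append-only and that appending to and indexing a Python list are safe across threads, as is the case in CPython.

```python
import threading

class SharedEntry:
    """Hypothetical cache entry that any consumer thread may fill on demand."""
    def __init__(self, source_blocks):
        self._source = iter(source_blocks)  # stands in for the web server
        self.blocks = []                    # completed blocks, append-only
        self.complete = False
        self.fetch_lock = threading.Lock()
        self.fetch_count = 0                # number of fetches made from the source

    def get_block(self, index):
        # State 1: read a completed block without any synchronization.
        if index < len(self.blocks):
            return self.blocks[index]
        # State 3: block here (no busy wait) while another thread is producing.
        with self.fetch_lock:
            # State 2: act as producer, but only if the block is still missing.
            while index >= len(self.blocks) and not self.complete:
                block = next(self._source, None)
                if block is None:
                    self.complete = True
                else:
                    self.fetch_count += 1
                    self.blocks.append(block)
        return self.blocks[index] if index < len(self.blocks) else None

def consume(entry, out):
    index = 0
    while True:
        block = entry.get_block(index)
        if block is None:
            return                          # the source is exhausted
        out.append(block)
        index += 1

data = [b"a", b"b", b"c", b"d"]
entry = SharedEntry(data)
outs = [[], [], []]                         # three concurrent clients
threads = [threading.Thread(target=consume, args=(entry, o)) for o in outs]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this sketch no thread exists solely to fill the cache, each block is fetched from the source exactly once regardless of the number of clients, and data is fetched only when some consumer asks for a block that is not yet cached.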
In accordance with one aspect of the present invention, as embodied and broadly described herein, a method of retrieving information from a server process for a client process comprises the steps of: creating a first thread associated with a request for information from the server process; using the first thread, receiving a first data block of the requested information from the server process; transmitting the first data block of information to the client process; and transmitting a second data block of information to the client process, wherein the second data block was received from the server process using a second thread. The information received from the server process may be cached. In addition, the information in the cache may be timestamped and removed from the cache when it exceeds a predetermined cache storage time limit, i.e., when it is deemed too old. The system may receive multiple concurrent requests for information.
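The timestamping and removal of cached information may be illustrated, purely as a sketch, by the following Python fragment. The class `TimedCache`, the parameter `max_age_seconds`, and the injectable `clock` are assumptions of this example; the description above does not specify how the time limit is checked, and here it is checked lazily on each read.

```python
import time

class TimedCache:
    """Hypothetical cache whose entries expire after a fixed storage time limit."""
    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock
        self._entries = {}                  # key -> (stored_at, value)

    def put(self, key, value):
        self._entries[key] = (self.clock(), value)   # timestamp on insertion

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self.clock() - stored_at > self.max_age:
            del self._entries[key]          # deemed too old: remove from the cache
            return None
        return value

# A fake clock makes the expiry observable without sleeping.
now = [0.0]
cache = TimedCache(max_age_seconds=60, clock=lambda: now[0])
cache.put("doc", b"data")
fresh = cache.get("doc")
now[0] = 61.0
stale = cache.get("doc")
```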
In accordance with another aspect of the present invention, as embodied and broadly described herein, a method for retrieving information from a server process for multiple client processes comprises the steps of: creating a plurality of consumer threads each associated with a client process; assigning a first one of the plurality of consumer threads as a producer thread; requesting information from a server process using the producer thread; and transmitting information received from the server process to a client process associated with one of the plurality of consumer threads. The method may also comprise the step of assigning a second one of the plurality of consumer threads as a producer thread. In addition, the method may further comprise the step of storing the information received from the server process in a cache, wherein the step of transmitting the information received from the server process comprises the step of transmitting information from the cache to the client process associated with the first one of the plurality of threads. Furthermore, the step of assigning the producer thread may comprise the steps of selecting the first one of a plurality of consumer threads associated with a client process which received the last data block of information from the cache and assigning the selected consumer thread as the producer thread. Finally, the method may also comprise the step of receiving multiple concurrent requests from the multiple client processes for the information.
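One possible reading of the producer-assignment step, offered only as an illustrative sketch, is that the producer role goes to the first consumer whose client has already received every block currently in the cache. The function `select_producer` and its arguments are hypothetical and represent each consumer by the number of cached blocks it has delivered so far.

```python
def select_producer(delivered, cached_count):
    """Return the index of the first consumer that has drained the cache, or None."""
    for i, count in enumerate(delivered):
        if count >= cached_count:
            return i
    return None

# Three consumers; the cache currently holds five data blocks.
delivered = [2, 5, 5]
chosen = select_producer(delivered, cached_count=5)
```

Here the second consumer is selected because it is the first to have received the last cached block, while the first consumer still has cached data to deliver and so has no need to fetch from the server.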