1. Field of the Invention
The present invention relates to a computer system, and deals more particularly with a method, system, and computer readable code for improving stress testing of Web servers. An altered form of client cache is used, enabling more realistic and representative client requests to be issued during the testing process.
2. Description of the Related Art
Use of the Internet and World Wide Web has skyrocketed in recent years. The Internet is a vast collection of computing resources, interconnected as a network, from sites around the world. It is used every day by millions of people. The World Wide Web (referred to herein as the “Web”) is that portion of the Internet which uses the HyperText Transfer Protocol (“HTTP”) as a protocol for exchanging messages. (Alternatively, the “HTTPS” protocol can be used, where this protocol is a security-enhanced version of HTTP.)
A user of the Internet typically accesses and uses the Internet by establishing a network connection through the services of an Internet Service Provider (ISP). An ISP provides computer users the ability to dial a telephone number using their computer modem (or other connection facility, such as satellite transmission), thereby establishing a connection to a remote computer owned or managed by the ISP. This remote computer then makes services available to the user's computer. Typical services include: providing a search facility to search throughout the interconnected computers of the Internet for items of interest to the user; a browse capability, for displaying information located with the search facility; and an electronic mail facility, with which the user can send and receive mail messages from other computer users.
The user working in a Web environment will have software running on his computer to allow him to create and send requests for information, and to see the results. These functions are typically combined in what is referred to as a “Web browser”, or “browser”. After the user has created his request using the browser, the request message is sent out into the Internet for processing. The target of the request message is one of the interconnected computers in the Internet network. That computer will receive the message, attempt to find the data satisfying the user's request, and return the located information to the browser software running on the user's computer.
This is an example of a client-server model of computing, where the machine at which the user requests information is referred to as the client, and the computer that locates the information and returns it to the client is the server. In the Web environment, the server is referred to as a “Web server”. The client-server model may be extended to what is referred to as a “three-tier architecture” or a “multi-tier architecture”. An extended architecture of this type places the Web server in an intermediate tier, where the added tier(s) typically represent data repositories of information that may be accessed by the Web server as part of the task of processing the client's request.
Because Web applications typically have a human user waiting for the response to the client requests, responses from the Web server must be returned very quickly, or the user will become dissatisfied with the service. Usage volumes for a server may be very large: a particular server may receive thousands, or even millions, of client requests in a day's time. These requests must all be handled with acceptable response times, or the users may switch to a competitor's application services.
Verifying that a server, and an application that will run on the server, can handle its expected traffic is a normal part of a stress testing process. Stress testing aims to uncover performance problems before a system goes into actual use, and is performed using simulated traffic. In this manner, any performance problems that are detected from the simulated traffic load can be addressed before any “real” users are impacted. To maximize the usefulness of the stress testing, the tests that are conducted need to be as realistic as possible. In the Web server environment, this means accurately predicting and simulating the number of requests that must be serviced, the type of requests (and mix of request types) that are received, the number of different clients sending requests, etc.
The requests received at a server typically originate from a client's browser. (Requests from other sources are outside the scope of the present discussion.) Browsers often make use of a client-side cache, where a local copy of Web documents may be stored after retrieving the document from a server. A browser using the cache checks for a user-requested document in this client-side cache, before requesting it from the server. Browsers implementing the Hypertext Transfer Protocol version 1.1 (“HTTP/1.1”) use an expiration mechanism and a validation mechanism with the client-side cache. These mechanisms are described in detail in sections 13.2 (“Expiration Model”) and 13.3 (“Validation Model”) of the HTTP specification, respectively, and are introduced in section 13 (“Caching in HTTP”). (The HTTP specification is available on the Web at http://info.internet.isi.edi/in-notes/rfc/files/rfc2068.txt.) The expiration mechanism provides that when an unexpired copy of the document is available in the cache, the response time to the user can be minimized by using this cached copy, thereby avoiding a network round trip to the server. When a copy of a document is in the cache, but it is unclear whether this version remains valid, the validation mechanism provides for reducing the network bandwidth by sending a conditional request to the server. A conditional request identifies the version of a document stored at the client by sending a “cache validator” to the server, which is a value the server uses to determine the validity of the client's document. If the server determines that this version is still valid, it responds with a short message to indicate this to the client's browser; the browser will then retrieve the locally-stored copy. Otherwise, when the client's stored copy is no longer valid, the server responds with a fresh copy. The browser uses this returned copy in response to the user's request, and will typically store this copy into the client-side cache.
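The expiration and validation mechanisms just described can be illustrated with a minimal Python sketch. The function names `plan_fetch` and `resolve_body` are hypothetical and appear nowhere in the specification; the header names `If-None-Match` and `If-Modified-Since` and the 304 status code are the standard HTTP/1.1 means of carrying cache validators and signalling a still-valid copy. No network traffic is generated; the sketch only shows which kind of request a caching browser would issue.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional, Tuple


def plan_fetch(
    expires: Optional[datetime],
    etag: Optional[str],
    last_modified: Optional[datetime],
    now: datetime,
) -> Tuple[str, Dict[str, str]]:
    """Decide how a caching browser would obtain a document (illustrative only)."""
    # Expiration mechanism: an unexpired cached copy avoids the round trip.
    if expires is not None and now < expires:
        return "use_cached_copy", {}

    # Validation mechanism: send whatever cache validators are known.
    headers: Dict[str, str] = {}
    if etag is not None:
        headers["If-None-Match"] = etag
    if last_modified is not None:
        # HTTP dates are expressed in GMT; the datetime must be UTC-aware.
        headers["If-Modified-Since"] = format_datetime(last_modified, usegmt=True)
    if headers:
        return "conditional_get", headers

    # Nothing cached (or no validators): fetch the full document.
    return "unconditional_get", {}


def resolve_body(status: int, response_body: bytes, cached_body: bytes) -> bytes:
    """A 304 (Not Modified) reply means the locally stored copy is still valid."""
    return cached_body if status == 304 else response_body
```

In this sketch a 304 response carries no document body, which is why the validation mechanism reduces network bandwidth relative to an unconditional retrieval.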
To provide a meaningful stress test of a Web server, it is necessary to simulate the traffic generated by a single browser as realistically as possible, and to simulate a realistic number of browsers, as previously mentioned. A very large number of browsers (perhaps thousands) may need to be simulated for some environments. Typically, a single client machine will be used to simulate multiple browsers, to limit the number of client machines that are required. For each simulated browser, a number of system resources are required on the client machine on which the browser operates. This often implies that trade-offs in the testing are required for system resources which are in limited supply. When caching browsers are simulated, the client-side cache is one such resource. An actual client cache for a single client can consume a very large amount of storage, on the order of hundreds of thousands (or even millions) of bytes. When simulating caching browsers, an upper bound may be placed on the number of simulated browsers in order to limit the cache storage requirements, but this will reduce the effectiveness of the testing. In particular, imposing a limit on the number of client browsers that can be simulated during a stress test may greatly reduce the ability for the test to provide realistic, representative traffic and to therefore provide useful results. As an alternative to limiting the number of simulated browsers, additional storage resources may be added, but this often greatly increases the expense (and perhaps the complexity) of the testing environment.
Several prior art test tool approaches are known, which implement different approaches for dealing with client-side caching browsers. First, an entire cache is sometimes used, in the same manner as an actual cache would be maintained. This approach has the disadvantage of requiring a very large amount of storage (also known as requiring a large “footprint” on the client machine), which necessitates a limit on the number of clients that can be simulated. As just discussed, the results obtained when using this approach for stress testing are unlikely to be realistic or representative. In a second approach, mindful of the need to conserve cache storage, all large object requests are filtered out. For example, image files are a common part of most Web pages, but a single image may require as much as several megabytes of storage. Other types of objects which tend to be large include video clips and sound files. Filtering a request for one of these types of objects involves suppressing the request from the browser, thereby conserving the storage space that would have been used to store the response. This has the disadvantage of reducing the realism of the test, because an actual user session would often request these large objects. Thus, the stress applied to the server is unrealistically lowered when these requests are filtered out. A third approach creates a testing scenario by recording an actual client browser session. This recorded session is then played back during testing, to simulate interactions with an actual user. (Typically, the recorded session is replicated to simulate many users.) The disadvantage of this approach is that all browser actions are predetermined, and therefore static: there is no way to introduce conditional messages that exercise the conditional cache validation mechanisms described above unless such messages occurred during the actual session. Even when such conditional messages were recorded, the results of the actual session may cause the recorded session to behave differently with respect to the cache, because the recorded session may see unexpired and still-valid data that was retrieved during the actual session. Thus, the ability to simulate a realistic stress test using HTTP/1.1 caching mechanisms is seriously inhibited.
Accordingly, a need exists for a technique by which these problems in the current test tools for Web servers can be overcome. This technique should enable realistically simulating traffic to stress a Web server, while minimizing the client footprint used in creating the realistic traffic load.
An object of the present invention is to provide a technique to improve stress testing of Web servers.
Another object of the present invention is to provide a technique whereby client-side caching is factored into Web server stress testing.
It is a further object of the present invention to enable Web server stress testing to be more realistic and representative than in the current art, while minimizing the client footprint required to support the testing environment.
It is another object of the present invention to provide a technique for Web server stress testing that uses conditional document retrieval requests that exercise cache validation mechanisms.
It is yet another object of the present invention to provide this technique in a manner that reduces the required information for cached documents, while still enabling conditional requests to operate.
Other objects and advantages of the present invention will be set forth in part in the description and in the drawings which follow and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides a system, method, and computer-readable code for use in a computing environment having a connection to a network, for improving stress testing of a Web server. This technique comprises: executing one or more client processes on one or more client machines, each of the client processes communicating with a server under test according to a first networking protocol; creating a meta-cache for each of the client processes during a stress test of the server, the meta-cache replacing a client-side cache used by the client process in an actual network communication to the server using the first protocol; and using the meta-cache during the stress test. Using the meta-cache further comprises: determining whether to send an unconditional request or a conditional request to the server; sending the unconditional request, wherein the unconditional request is generated without using the meta-cache; and sending the conditional request, wherein the conditional request is generated using the meta-cache.
The step of using the meta-cache preferably further comprises selecting a target Uniform Resource Locator (URL), and the unconditional and conditional requests are preferably sent using this selected URL. Preferably, determining whether to send an unconditional or conditional request further comprises accessing the meta-cache using the selected URL to determine if a corresponding meta-cache entry exists, and an unconditional request is used when the entry is not located. Optionally, this determination may be a changeable decision as to whether to send the unconditional or the conditional request.
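By way of illustration only, the determination described above might be sketched in Python as follows. The function `choose_request`, the dictionary layout of the meta-cache, and the `force_unconditional_rate` parameter are all assumptions introduced for this sketch; in particular, modelling the "changeable decision" as a configurable probability is one possible reading and is not stated in the description.

```python
import random
from typing import Dict, Optional, Tuple

# Hypothetical layout: selected URL -> validators remembered for that URL.
MetaCache = Dict[str, Dict[str, str]]


def choose_request(
    meta_cache: MetaCache,
    url: str,
    force_unconditional_rate: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return ("unconditional" | "conditional", request headers) for a URL."""
    rng = rng or random.Random()
    entry = meta_cache.get(url)

    # No meta-cache entry for the selected URL: send an unconditional request,
    # generated without using the meta-cache.
    if entry is None:
        return "unconditional", {}

    # Assumed knob: the tester may override the decision for some fraction
    # of requests even when an entry exists.
    if rng.random() < force_unconditional_rate:
        return "unconditional", {}

    # Conditional request: generated from the validators in the meta-cache.
    headers: Dict[str, str] = {}
    if entry.get("etag"):
        headers["If-None-Match"] = entry["etag"]
    if entry.get("last_modified"):
        headers["If-Modified-Since"] = entry["last_modified"]
    if not headers:
        # An entry without any validator cannot drive a conditional request.
        return "unconditional", {}
    return "conditional", headers
```

With the default rate of 0.0 the decision depends only on whether an entry exists for the URL; a rate of 1.0 would force every request to be unconditional.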
Preferably, the meta-cache comprises a plurality of entries, each of the entries comprising: a particular URL identifier for the entry; zero or more entity tag values; an optional last-modified date value; a content-length value; a checksum value; and an optional expiration date value.
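The entry layout listed above might be represented as in the following Python sketch, offered under stated assumptions rather than as the claimed implementation. The class name `MetaCacheEntry`, the helper `entry_from_response`, and the choice of CRC-32 as the checksum are hypothetical; the description does not name a checksum algorithm. The sketch shows the intent of keeping only a small amount of information about a document: the response body is measured and checksummed, then discarded.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MetaCacheEntry:
    url: str                                  # URL identifier for the entry
    content_length: int                       # length of the document body
    checksum: int                             # CRC-32 here (an assumption)
    entity_tags: List[str] = field(default_factory=list)  # zero or more
    last_modified: Optional[str] = None       # optional last-modified date
    expires: Optional[str] = None             # optional expiration date


def entry_from_response(url: str, headers: Dict[str, str], body: bytes) -> MetaCacheEntry:
    """Summarize a response without retaining its body (illustrative only)."""
    etag = headers.get("ETag")
    return MetaCacheEntry(
        url=url,
        content_length=len(body),
        checksum=zlib.crc32(body),
        entity_tags=[etag] if etag else [],
        last_modified=headers.get("Last-Modified"),
        expires=headers.get("Expires"),
    )
```

Under these assumptions, a response of several megabytes is reduced to a handful of short fields, so the per-browser storage no longer grows with the size of the documents retrieved.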
The present invention will now be described with reference to the following drawings, in which like reference numbers denote the same element throughout.