Many large business-critical web applications experience widely varying load and usage during any given period of time. For example, an online shopping web application may undergo a steady and manageable application workload until a television advertisement announcing a promotion offered through the application airs during the Super Bowl or another highly viewed television event. Soon after the airing, the online shopping application may experience a sharp increase or “burst” in application workload resulting from a large number of users simultaneously accessing the application in an attempt to gain the benefit of the promotion. Extended periods of relatively heavy application workload are also common. For example, the shopping application may undergo a high-volume application workload for the entire week before Christmas.
Variances in application workload during a period of time can cause defects (i.e., bugs) to manifest in a web application that are difficult to predict beforehand. Prediction is difficult because the defects readily occur only under certain difficult-to-replicate operating conditions, such as a high-volume or highly concurrent application workload. As such, solutions have been developed to aid application developers and testers in diagnosing and troubleshooting such difficult-to-predict defects.
In one approach, a real application workload submitted to a production web application in a production environment is captured and replayed against a replica of the web application in a test environment. The goal is to reproduce the production environment behavior in the test environment, where the reproduced behavior can be analyzed and diagnosed by application developers and testers, perhaps repeatedly. However, with most web applications, this approach in and of itself will not reliably reproduce the order of database data changes observed in the production environment. Reliable reproduction of database changes is difficult or impractical with this approach because most web applications are affected by non-deterministic factors that are difficult or impractical to control when replaying the captured application workload. Examples of such non-deterministic factors include concurrently executing processes and threads, network latency, hardware timers and interrupts, and thread context switching, among others.
Unreliable reproduction of database changes when replaying a captured application workload presents at least two problems. First, if the occurrence of a defect observed in the production environment depended on the order of database changes in the production environment, then that defect may not be reproduced when the application workload is replayed. The defect may not be reproduced because non-deterministic factors in the test environment may cause the order of database changes to diverge from the order that occurred in the production environment. Second, if the order of database changes in the test environment is allowed to diverge from the order of database changes in the production environment, then an error that did not occur in the production environment may occur in the test environment, potentially even preventing the defect observed in the production environment from being reproduced in the test environment.
As an example of these two problems together, consider a web application for purchasing a seat on an airline flight. Assume that in the production environment, two concurrent application requests (REQ1, REQ2) from two users are made to the web application to reserve the same airline seat on the same flight. Further assume the user (USER1) submitting REQ1 is able to reserve the airline seat while the user (USER2) submitting REQ2 is not able to reserve the seat, the seat being already reserved by USER1. In response to REQ1, a data change is made to a database to reflect USER1's reservation of the seat. Further assume that USER1 subsequently issues another request (REQ3) to the web application cancelling the prior reservation, and that as a result of REQ3 an unexpected defect occurs in the web application.
Captured requests REQ1, REQ2, and REQ3 are then replayed in the test environment for the purpose of reproducing the defect. In the test environment, when REQ1 and REQ2 are concurrently replayed, it may be that, because of non-deterministic factors in the test environment, REQ2 is able to reserve the airline seat for USER2 while REQ1 is not able to reserve the seat for USER1, even though REQ1 was able to reserve the seat for USER1 in the production environment. In response to REQ2 in the test environment, a data change is made to the test environment database to reflect USER2's reservation of the seat. When REQ3 is replayed in the test environment, it may fail not because of the unexpected defect that caused REQ3 to fail in the production environment, but because REQ3 in the test environment is attempting to cancel a reservation for USER1 that does not exist in the test environment database. Thus, by not reliably reproducing database changes in the test environment, an error that did not occur in the production environment can occur in the test environment and mask the “true” defect. Consequently, this approach is less than optimal.
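The order dependence in the airline seat example can be illustrated with a minimal sketch. The sketch below is an assumption-laden toy model, not part of any described system: the function name replay, the result strings, and the single in-memory seat are all hypothetical, and the two possible interleavings of REQ1 and REQ2 are modeled as two explicit request orders rather than as real concurrent threads. The unexpected production defect itself is not modeled; the sketch only shows that the cancellation path exercised by REQ3 in the production order is never reached in the divergent order.

```python
from typing import Dict, List, Optional, Tuple

# (request id, user, action) -- a hypothetical shape for a captured request
Request = Tuple[str, str, str]


def replay(requests: List[Request]) -> Dict[str, str]:
    """Apply requests in the given order to a single seat and record outcomes."""
    seat_owner: Optional[str] = None  # stands in for the database row for the seat
    results: Dict[str, str] = {}
    for req_id, user, action in requests:
        if action == "reserve":
            if seat_owner is None:
                seat_owner = user
                results[req_id] = "reserved"
            else:
                results[req_id] = "rejected: seat taken"
        elif action == "cancel":
            if seat_owner == user:
                seat_owner = None
                results[req_id] = "cancelled"
            else:
                # The spurious test-environment error that masks the true defect.
                results[req_id] = "error: no reservation for " + user
    return results


REQ1: Request = ("REQ1", "USER1", "reserve")
REQ2: Request = ("REQ2", "USER2", "reserve")
REQ3: Request = ("REQ3", "USER1", "cancel")

# Order of database changes as observed in production.
production_results = replay([REQ1, REQ2, REQ3])
# One order that non-deterministic factors might produce during replay.
divergent_results = replay([REQ2, REQ1, REQ3])
```

In the production order, REQ3 reaches the cancellation logic for an existing USER1 reservation, which is where the defect would manifest. In the divergent order, REQ3 fails earlier because no USER1 reservation exists.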
In another approach, an application workload representative of a real application workload is captured. The captured representative application workload is replicated as necessary to create a “synthetic” application workload that approximates the volume and concurrency of the real application workload. For example, the representative application workload might comprise a number of requests made by a single user to a web application in a production environment. A synthetic workload approximating a real application workload comprising N concurrent users may be created by replicating the captured requests N times. However, because of non-deterministic factors in the web application, creating a synthetic application workload that can reliably and faithfully reproduce database changes caused by the real application workload may be impractical. Further, a human user is typically required to design the synthetic application workload. At best, this approach is time-consuming, expensive, and error-prone.
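The replication step of the synthetic approach can be sketched as follows. This is an illustrative assumption only: the CapturedRequest type, the synthesize function, the synthetic user naming scheme, and the request paths are all hypothetical. The sketch also makes the limitation visible: it produces N copies of one user's requests but says nothing about the order in which those requests would change the database when run concurrently.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class CapturedRequest:
    """Hypothetical record of one request captured from a single user."""
    user: str
    method: str
    path: str


def synthesize(captured: List[CapturedRequest], n_users: int) -> List[CapturedRequest]:
    """Replicate a single user's captured requests once per synthetic user."""
    workload: List[CapturedRequest] = []
    for i in range(1, n_users + 1):
        for req in captured:
            # Same request, re-attributed to a distinct synthetic user.
            workload.append(replace(req, user=f"synthetic-user-{i}"))
    return workload


# Hypothetical representative workload captured from one user.
captured = [
    CapturedRequest("USER1", "GET", "/promotion"),
    CapturedRequest("USER1", "POST", "/cart"),
]
workload = synthesize(captured, 3)
```

A tester would still have to decide how the replicated requests are scheduled against the application, which is part of why designing such a workload tends to require manual effort.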
As a result of these disadvantages, existing testing systems do not reliably reproduce a real application workload and do not scale well. Accordingly, a better solution is sought.