Testing, monitoring, automation, and other web client simulation tools, hereinafter called simulation tools, often use a recorder to automatically generate scripts that simulate user activity for repeated replay. For example, load testing tools can simulate large numbers of users using multiple technologies and allow businesses to predict the behavior of their e-business web application before it is introduced for actual use, regardless of the size and complexity of the application. A reliable load testing tool can simulate the activities of real users of various profiles who use web clients such as web browsers to access the web server. The load testing tool can simulate the activities of a maximum expected number of users, maintain a steady number of users over a certain period of time, or stress a single component of the web application. The load testing tool measures performance, reliability, accuracy and scalability and as such can reveal a wide range of possible errors. The load testing tool simulates real users by using virtual users that effect client-server interaction on the protocol level, not the graphical user interface (GUI) level. By performing a load test, businesses can test the performance, scalability, and reliability of the web application.
A load testing tool for web applications includes a controller, multiple agents, and multiple virtual users. The controller manages and monitors the load test. The agents are connected to the controller and run on the same computer as the controller or on remote computers. Each agent hosts multiple virtual users. Each virtual user makes requests to and receives responses back from the web application on the web server. The request contains the URL of the document to be retrieved from the server, and the response contains the document itself and information about the type of document. The document can be, for example, an HTML document, an image, a PDF™ file, a JavaScript™ file, or a plain text file.
A simulation tool incorporates a replay engine which simulates virtual users 608 with respect to the network traffic they generate 609, as shown in FIG. 6. A script 604 is a series of instructions which are input to the replay engine and can be in any format the replay engine understands (e.g., textual script written in a programming language, instructions stored in a database, or XML). A script can be written using a software tool like a text editor.
However, the most convenient way of generating a script is to use a recorder 602 which generates scripts 604 based on the HTTP(S) transactions 601 resulting from a real user using the web application. The recorder records into one or more scripts the actions of the real user such as clicking on hyperlinks, submitting forms, transitioning back and forth in the session history (e.g. using a web browser back button), and pausing between activity.
The recorded scripts are then replayed simultaneously to simulate the user interactions of many users in order to test the server. The scripts for each virtual user are executed concurrently by the replay engine to generate the desired load on the test environment. Each component of the test environment can be monitored in real-time during the load test replay to allow the testers to view the web application's performance at any time. The results of the load test replay can be analyzed to improve weaknesses and address errors in the web application. The scripts can be rerun to verify the adjustments.
Real users interact with a web application using a client program on a computer 101 that calls upon services provided by a server program as shown in FIG. 1. The client program can be a web browser that requests services from a web server 102, which in turn responds to the request by providing the services. This interaction is in the form of HTTP requests 103 and responses 104.
Virtual users do not interact with the web application using a client program since this would involve running a client program for each individual virtual user. Running separate client programs for each virtual user is impractical for a load test because it wastes resources. Instead, the interactions of each virtual user with the web server take place on the protocol level. Therefore, in conventional load testing tools, the scripts display user interactions in the form of HTTP requests. The recorder records network activity between the client application and the server. Recorders used in load testing do not record activity within the client application such as the movement of each user's mouse or the specific keystrokes that each user performs.
A web page, as shown in FIG. 2, includes one or more documents received from the web server. Each document is received by sending one HTTP request (or more than one request when there are redirections). A web page forms a tree which has a root document 201 and leaves. A root document is a document that can have one or more embedded sub-documents 202, 203 such as HTML documents, images, embedded objects, frames, scripts, applets, and style sheets. Leaves 204 are documents, such as images, style sheets, and plain text, that cannot have sub-documents. The visual representation of a web page is exactly what the real user sees when using the web browser. Documents can also contain hyperlinks 205 and forms 206, which are not separate documents. A user can click on a hyperlink to transition to another web page having the URL associated with the hyperlink. Forms can be completed and submitted to the server by the user, thereby causing a transition to another web page having the URL associated with the form. Web pages can also include executable code that is executed by the web browser (e.g. JavaScript™, VBScript™, and Java™ applets). The executable code can be embedded in HTML documents or contained in sub-documents.
FIG. 5 shows a session history, which is a sequence of web pages that are downloaded or retrieved from a client side cache. The interaction of the user with the web application is called a user session. The session begins when the user starts the browser and navigates to the web application. The session ends when the user closes the browser or navigates to a different web application. The URL of the first page 501 is specified by the user, e.g., by entering the URL in the address bar of the browser 511 or by clicking on a “bookmark”. Each successive page 502, 503, 504, 505 is the result of a page transition from one page to the next page. A page transition can be caused by the user clicking a hyperlink 512, the user filling in and submitting a form 513, by the user navigating in the browser's page history by clicking the “back” or “forward” button 514, or by the browser executing client side code 515.
Although the HTTP protocol is a stateless protocol, complex web applications need to be able to relate a single request to one or more preceding requests. Such requests are common in web applications that use session IDs, shopping carts, user IDs, and other forms of state information. These forms of state information can be in the form of unique strings of characters that are placed in requests or responses, as well as the transferred documents such as HTML. The unique string appears hard-coded in each related request and response. Real clients such as web browsers can correctly identify state information in responses and can also correctly embed such state information in subsequent requests.
For example, a session ID is a unique string of characters that a web application assigns to all responses within a period of time in which a client and a server interact. The client then returns the session ID within a request to the server so that the web application can determine which client sent the request. A shopping cart is commonly used in e-business applications to handle catalogs of items that an individual user would like to purchase. User IDs are assigned by a web application to identify a particular user. If a simulation tool cannot correctly identify a session ID, a shopping cart ID, a user ID, or other forms of state information within a response and transfer it back to the server in subsequent requests, the simulation tool does not correctly simulate real clients. This may lead to invalid test results or even errors which in fact are not errors of the web application but rather an artifact of the simulation tool being unable to simulate real clients properly.
Conventional load testing tools can only handle standardized state information called cookies. All other forms of state information can, if at all, only be handled by manually customizing the script.
Context management is the ability of a testing tool to: a) manage state information during replay by dynamically analyzing content for state information, even when the content is generated dynamically and contains state information; b) properly transfer this state information back and forth in requests and responses; c) restructure received documents (one or more HTTP(S) transactions) into web pages; and d) maintain a session history to allow virtual users to transition back and forth between web pages in the session history.
Missing or poor context management of a simulation tool can result in the mishandling of state information which ultimately can lead to a failed, unreliable, and ineffectual load test. For example, when multiple virtual users log into the same account or session due to incorrect state management, the load on the web application does not correctly model the real intended behavior of the live application. Accurate simulation/testing tools would create one session per virtual user. Poor context management of a load testing application, in particular, can lead to inaccurate test results or failed tests. Missing automatic context management by the simulation tool must be compensated by large efforts to customize scripts manually for correct state management, thereby resulting in high costs and lost revenue.
State management may be achieved using state information which can be included as a unique string in cookies, URLs of links or embedded objects, or form fields. The string acts as a session ID or other ID such as an encryption ID or a serialized object reference to an object residing at the server which is placed in a hidden field. The string allows the server to identify the unique web session in which the original request originated or other state information, and the string is returned by the browser to the server as part of any subsequent request, thus relating the requests.
In a simulation tool without state management, the hard-coded session ID is sent to the server when replaying the script. However, since the specific session ID does not correctly identify the replayed session, this replay does not run correctly. The session ID only identifies the session ID of the recorded session and cannot be used again for the replayed session. Therefore, a script that uses such state information is unsuitable for a proper load test, and the web application will most likely generate errors during replay since sessions are usually terminated by the web server after a predetermined period of time. Load testing tools without state management generate scripts that must thus be customized, manually or via a tool, to handle state information such as session IDs in web applications in order to avoid such problems.
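The failure mode described above can be illustrated with a minimal sketch. The toy server, its status codes, and the names ToyServer, replay_hard_coded, and replay_dynamic are assumptions introduced only for illustration; they do not describe any particular load testing tool or web server.

```python
import secrets


class ToyServer:
    """Hypothetical stand-in for a web application that tracks sessions."""

    def __init__(self):
        self.active = set()

    def login(self):
        # The server assigns a fresh, unique session ID per session.
        sid = secrets.token_hex(8)
        self.active.add(sid)
        return sid

    def request(self, path, sid):
        # Requests carrying an unknown or expired session ID are rejected.
        return 200 if sid in self.active else 403

    def expire_all(self):
        # Models the server terminating sessions after a period of time.
        self.active.clear()


def replay_hard_coded(server, recorded_sid):
    # Script without state management: the recorded ID is sent as is.
    return server.request("/cart", recorded_sid)


def replay_dynamic(server):
    # Script with state management: the ID is obtained during replay.
    sid = server.login()
    return server.request("/cart", sid)
```

In this sketch, a session ID captured at recording time is accepted only until the server expires it, after which the hard-coded replay is rejected while the replay that obtains its ID dynamically succeeds.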
Conventional simulation tools use an HTTP-level replay or a page-level replay to execute scripts. A low-level, HTTP-level replay executes script instructions that specify single HTTP transactions. The information for each HTTP transaction, such as the URL and form data, is specified in the script. Because of this, session information, which is part of these URLs, and form data are hard-coded in the script and are not dynamically parsed out of documents during runtime. An HTTP-level API is therefore not suited for automatic state management and does not provide context management either.
In contrast, a conventional simulation tool with a page-level replay executes script instructions that specify complete web pages that can include various images and frames to be downloaded. The replay engine uses a document parser (e.g. HTML parser) to obtain the URLs referencing embedded objects and frames at replay time. Since downloading a single web page using the page-level API automatically initiates HTTP requests for also downloading embedded objects or frames, the page-level script is typically shorter than the HTTP-level script. The page-level script can also automatically obtain state information which may be contained in URLs referencing frames and embedded objects in real time. Such state information is contained in URLs embedded in HTML tags of HTML documents and parsed in real time, but is not hard-coded in the page-level script.
The present invention includes a recorder which is able to identify the context of web page transitions by analyzing the HTTP(S) network traffic even for web applications that use client side code execution. Conventional recorders can also record context-full page-level scripts for web applications without client side code, but fail in many cases to do so if the web application uses client side code.
Both a web browser and the replay engine of a conventional simulation tool that is capable of a page-level replay follow the same steps when downloading a web page from a web server. Documents are downloaded (step 401) by the client from the server as shown in FIG. 4, starting with the root document. If a downloaded document is parsable (step 402) (e.g. an HTML document), a standard HTML parser parses the document (step 404). A standard HTML parser parses each HTML document as it is received by the client from the web server so that embedded objects and frames can be detected and downloaded automatically. Since the conventional page-level API function call requests all embedded objects and frames in the root document, the standard HTML parser specifically looks for these embedded objects and frames in the HTML document. The standard HTML parser also looks for hyperlinks and forms in the HTML document so that they can be referenced in subsequent API calls when transitioning between web pages.
As shown in FIG. 3, the standard HTML parser 302 parses an HTML document 301 and can output a list of frames 311, a list of embedded documents 312, a list of hyperlinks 313, and a list of forms 314, each with their associated URLs. Each embedded object and frame is downloaded following the same procedure of FIG. 4 in a recursive way. A document that is not parsable stops the recursion (step 403).
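The four output lists of FIG. 3 can be sketched with the HTML parser from the Python standard library. The class name StandardParser, the helper parse_document, and the choice of which tags count as frames or embedded objects are assumptions made for this illustration; a production parser would cover more tags and attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class StandardParser(HTMLParser):
    """Collects frames, embedded documents, hyperlinks, and forms."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frames = []      # list 311
        self.embedded = []    # list 312
        self.hyperlinks = []  # list 313
        self.forms = []       # list 314 (action URLs only in this sketch)

    def _abs(self, url):
        # Resolve a possibly relative URL against the document URL.
        return urljoin(self.base_url, url)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("frame", "iframe") and a.get("src"):
            self.frames.append(self._abs(a["src"]))
        elif tag in ("img", "script") and a.get("src"):
            self.embedded.append(self._abs(a["src"]))
        elif tag == "link" and a.get("rel") == "stylesheet" and a.get("href"):
            self.embedded.append(self._abs(a["href"]))
        elif tag == "a" and a.get("href"):
            self.hyperlinks.append(self._abs(a["href"]))
        elif tag == "form":
            self.forms.append(self._abs(a.get("action") or ""))


def parse_document(html, base_url):
    parser = StandardParser(base_url)
    parser.feed(html)
    parser.close()
    return parser
```

Each URL in the frames and embedded lists would then be downloaded and, if parsable, passed through the same function again, which corresponds to the recursion of FIG. 4.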
The conventional recorder uses the HTML parser to analyze the HTML documents wherein a single HTML document is retrieved by one (or more when there are redirections) HTTP requests. Conventional recorders save some state information hard-coded in the API call parameters in the script.
Page-level replay instructions can be “context-full” or “context-less”, depending on how a page transition is specified by the replay instruction. A “context-less” replay instruction can be executed by the replay engine using information only specified in the script. A context-less replay instruction includes the URL of the root document to download and optional replay instructions that send form data without referring to a form contained in a previously downloaded web page. Both the URL and all form field names and values are specified in the script, possibly containing hard-coded state information.
The term “context-less” refers to the fact that the replay engine can execute such a replay instruction without using the context of the user session executed so far. No references to previously downloaded web pages in the user session exist, and no dynamic data from previously downloaded web pages is used.
A “context-full” replay instruction is a replay instruction which refers to a previously downloaded page. The term “context-full” refers to the fact that the replay engine can execute the replay instruction only within the context of the replay session up to the point where the context-full replay instruction is executed. Without a session history, the replay engine would not be able to collect the data needed to perform a context-full replay instruction.
The replay engine identifies the HTML document and all of the information associated with the particular HTML document such as the embedded images and frames that form the HTML document. All of the information is downloaded in real-time during the replay. There are no session IDs hard-coded in the script.
For example, a replay instruction can download a web page by following a hyperlink contained in the previously downloaded web page. To download the web page, the replay engine uses the document parser to obtain the URL that is associated with that hyperlink. The URL associated with the hyperlink is obtained in real-time during the replay. The script only contains the reference to the hyperlink, but not the URL, which might contain state information. A reference to a hyperlink may consist of several items, e.g., name of the hyperlink, ordinal number, name of containing frame, reference to the containing web page. These types of references typically do not contain state information.
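A context-full hyperlink instruction of this kind might be resolved as in the following sketch. The function resolve_hyperlink and the reference format (link text plus ordinal number) are hypothetical; the references used by an actual replay engine may contain further items such as the containing frame.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def resolve_hyperlink(page_html, page_url, name, ordinal=1):
    """Return the URL of the ordinal-th link named `name` on the page."""
    collector = LinkCollector()
    collector.feed(page_html)
    collector.close()
    matches = [href for text, href in collector.links if text == name]
    if len(matches) < ordinal:
        raise LookupError(f"hyperlink {name!r} #{ordinal} not found")
    return urljoin(page_url, matches[ordinal - 1])
```

Because the script stores only the name and ordinal, the same instruction yields whatever URL the server placed in the page during the current replay, including any fresh session ID.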
In another example, a replay instruction submits a form which is contained in a previously downloaded web page, given a reference to that form. The replay engine uses the document parser to obtain the URL associated with the form, and the names and initial values of the form fields. The script only contains the name of the form and the values of form fields that are different from the original values (i.e. the values that have been edited by the user). The script does not contain the URL, which may contain state information. The script also does not contain the values of hidden or otherwise unchanged form fields, which also may contain state information. A reference to a form may consist of several items, e.g., name of the form, ordinal number, name of containing frame, reference to the containing web page. These types of references typically do not contain state information.
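The form case can be sketched in the same way. The names FormCollector and build_submission are hypothetical, only input elements are considered, and forms are looked up by their name attribute alone; these are simplifying assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormCollector(HTMLParser):
    """Collects each form's action URL and initial field values."""

    def __init__(self):
        super().__init__()
        self.forms = {}  # form name -> {"action": str, "fields": dict}
        self._current = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self._current = {"action": a.get("action") or "", "fields": {}}
            self.forms[a.get("name") or ""] = self._current
        elif tag == "input" and self._current is not None and a.get("name"):
            self._current["fields"][a["name"]] = a.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def build_submission(page_html, page_url, form_name, edited_fields):
    """Merge the script's edited values into the form parsed at replay time."""
    collector = FormCollector()
    collector.feed(page_html)
    collector.close()
    form = collector.forms[form_name]
    fields = dict(form["fields"])  # hidden and unchanged values from the page
    fields.update(edited_fields)   # values the user edited, from the script
    return urljoin(page_url, form["action"]), fields
```

In this sketch the script supplies only the form name and the edited value, while the action URL and the hidden field are taken from the page downloaded during replay.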
The page-level script can eliminate many of the URLs that contain state information since usually the URL for the first HTML document is the only URL hard-coded in the script. The URL typically corresponds to the first page specified by the virtual user and typically does not contain state information since it is the entry point to the web application. The other links are obtained dynamically during the replay as URLs which are obtained by parsing the downloaded HTML document during replay. These links correspond to context-full replay instructions in the script. By obtaining information dynamically, the use of hard-coded session IDs in the script can be avoided.
The capability of the replay engine for automatic state management by means of executing context-full replay instructions is called automatic context management. A page-level replay is suited for automatic context management for web applications that do not use client side code execution, i.e., each page transition is implemented by standard hyperlinks and form submissions.
The term “client side code execution” refers to code executed within the web browser by the client such as JavaScript™ code embedded in HTML documents or in separate JavaScript™ documents, Java™ applets, VBScript™, ActiveX™ controls, or browser plug-ins.
Code executed on the client side may dynamically assemble URLs and forms and cause page transitions using these URLs and forms. Such page transitions cannot be modeled by context-full replay instructions within a traditional page-level replay, because the URLs and forms may not correspond to any hyperlink or form contained in any previously downloaded web page.
The standard page level recorder/replay is sufficient for web applications that interact with client programs using standard HTML documents. However, a standard page level recorder/replay using a standard HTML parser is unable to parse code executed on the client side since it is unable to recognize URLs in the client side code. These URLs are recorded into the script, and this leads to errors during replay if these URLs contain hard-coded state information.
For a successful record/replay, the virtual users simulated by the testing tool need to behave similarly to a web browser. For instance, a real user downloads a JavaScript™ document referenced by an HTML document and executes the JavaScript™ when the page is viewed in the web browser. The JavaScript™ code can generate a direct HTTP request to the server. However, URLs included in JavaScript™ code that is executed on the client side are not recognized by a standard HTML parser since they do not appear as standard links, frames, forms, and embedded objects. If a URL is coded in the JavaScript™, the standard HTML parser does not recognize it, so the page-level API cannot use the URL (which might contain state information) for a context-full page transition or for the automatic loading of an embedded object or frame. The URL is instead recorded as is, leading to context errors during replay.
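The limitation can be demonstrated with a short sketch. The sample page, the class TagUrlCollector, and the function urls_seen_by_parser are made up for illustration; the parser inspects only URL-bearing tag attributes, as a standard HTML parser would.

```python
from html.parser import HTMLParser


class TagUrlCollector(HTMLParser):
    """Collects URLs that appear in href, src, or action attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                self.urls.append(value)


# Hypothetical page: one ordinary link, one URL assembled by script code.
PAGE = """
<html><body>
<a href="/catalog?sid=S1">Catalog</a>
<script>
  function go() { window.location = "/checkout?sid=" + "S1"; }
</script>
</body></html>
"""


def urls_seen_by_parser(html):
    collector = TagUrlCollector()
    collector.feed(html)
    collector.close()
    return collector.urls
```

Only the catalog link is reported. The checkout URL exists solely as string fragments inside the script body, so a request to it cannot be matched to anything in the session history.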
The inability of the standard HTML parser to parse code such as JavaScript or applets makes it difficult to model accurately the interactions between the web application and the virtual users in a page-level API. Recorders of typical simulation tools may not be able to properly record context-full functions when code is executed on the client side and will record context-less functions instead.
The recorder cannot determine the context of an API function when it observes an HTTP request that cannot be interpreted as an embedded object, a frame, a link, a form, or any other entity that the standard HTML parser can detect. The standard HTML parser does not have the ability to identify the URLs generated during client side code execution. Therefore, the recorder records context-less function calls corresponding to the non-interpreted HTTP request.
The standard HTML parser cannot recognize a URL that is generated by executing client side code and therefore cannot associate the URL of the recorded HTTP transaction with any URL in the session history. Therefore, the recorder records a hard-coded URL in the API function call.
The scripts that are generated by a recorder using a standard HTML parser produce an unreliable load test replay when the replay engine executes a context-less replay instruction using hard-coded state information. The standard HTML parser is only able to identify URLs found in HTML tags such as hyperlinks and references to frames and embedded objects.
Client-side code causes actions that are performed by the client, rather than by executing web application code on the server. Code executed on the client side gives web developers increased functionality over standard HTML techniques and provides the client side portion of the web application with abilities such as loading a web document, loading embedded objects, and modifying HTML code and form values.
Conventional load testing tools exhibit poor context management when the conventional page-level recorder generates context-less function calls after code is executed on the client side. When the replay engine executes the context-less function call, the state information contained in the context-less function call may generate errors in the replay. These errors are generally unrelated to the performance of the web application under a load test, thereby leading to failed or unreliable test results.
State information can be handled properly by driving a web browser during the load test for each individual virtual user. A scalable, high-performance load testing tool does not drive Web browsers and as such does not execute client side code since this wastes resources. Instead, a virtual user accesses a web page on the protocol level during a load test. Running separate client programs for each virtual user is also unnecessary since a load testing tool is intended to test the server and not the client. The execution of application code for each virtual user during the load test has a heavy impact on scalability and performance measurement accuracy. Therefore, execution of client side code is especially not an option for a simulation tool that aims to simulate thousands of virtual users on a single agent machine.
Conventional load testing tools also exhibit poor context management when handling hidden form fields and form fields that are modified, added, or removed using code executed on the client side. Web applications commonly use hidden form fields to transfer session information when users complete and submit forms to the web application. Also, JavaScript™ and applets are commonly used to modify, add, and remove form fields. The standard HTML parser overlooks state information which is carried or generated by JavaScript or applets. Since the standard HTML parser cannot identify dynamically adjusted hidden form field information in the HTML document without executing the client side code, the session information commonly found in hidden form fields cannot be removed from scripts and subsequently obtained dynamically in the replay. Instead, the state information remains as hard-coded information in the recorded scripts in context-less API function calls.
As shown in FIG. 13, the recorder processes an HTTP transaction by inspecting the transaction (step 1301) using the session history 1302 to identify the role of the HTTP transaction within the session history.
If the HTTP transaction corresponds to an embedded object (step 1311), no replay instruction is added to the script. The session history is updated to reflect the fact that this embedded object has been downloaded.
If the HTTP transaction corresponds to a frame (step 1312), no replay instruction is added to the script. The session history is updated to reflect the fact that this frame has been downloaded.
If the HTTP transaction corresponds to a hyperlink (step 1313), a new context-full script instruction “FollowHyperlink” is recorded in the script along with parameters that allow the replay engine to reference the hyperlink during replay. A new web page is added to the session history resulting from following the hyperlink. However, this new web page in the session history is incomplete and will be populated with embedded documents and frames as the recorder processes the upcoming HTTP transactions.
If the HTTP transaction corresponds to a form submission of an existing form (step 1314), a new context-full script instruction “SubmitForm” is recorded in the script along with parameters that allow the replay engine to reference the form during replay and the names and values of the form fields that have been edited by the user. A new web page is added to the session history resulting from the form submission. However, this new web page in the session history is incomplete and will be populated with embedded documents and frames as the recorder processes the upcoming HTTP transactions.
If the HTTP transaction corresponds to neither a hyperlink nor a form submission, but contains form data (step 1315), a new context-less script instruction “SendForm” is recorded in the script along with the complete form data that is to be used for sending the form during script replay, including the URL to use and the complete list of names and values of the form fields. A new web page is added to the session history resulting from the form submission. However, this new web page in the session history is incomplete and will be populated with embedded documents and frames as the recorder processes the upcoming HTTP transactions.
If the HTTP transaction corresponds to neither a hyperlink nor a form submission and does not contain form data (step 1316), a new context-less script instruction “LoadPage” is recorded in the script, along with the URL that is to be used for loading the page during script replay. A new web page is added to the session history representing the new web page. However, this new web page in the session history is incomplete and will be populated with embedded documents and frames as the recorder processes the upcoming HTTP transactions.
If the HTTP transaction corresponds to neither a hyperlink nor a form submission (steps 1315 and 1316), the recorder records context-less script instructions to the script. The way the conventional recorder handles these cases is the reason why hard-coded state information is incorporated in recorded scripts.
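The decision sequence of FIG. 13 can be summarized in a sketch. The SessionHistory structure, the lookup by exact URL, and the tuple format of the returned instructions are assumptions for illustration. Updating the session history after each transaction is omitted, and for brevity the sketch records all submitted form data with SubmitForm rather than only the edited fields.

```python
from dataclasses import dataclass, field


@dataclass
class SessionHistory:
    """Hypothetical index of URLs found in previously downloaded pages."""
    embedded_urls: set = field(default_factory=set)
    frame_urls: set = field(default_factory=set)
    hyperlink_urls: dict = field(default_factory=dict)  # URL -> link reference
    form_actions: dict = field(default_factory=dict)    # URL -> form reference


def record_transaction(url, form_data, history):
    """Return the replay instruction to record, or None if nothing is recorded."""
    # Steps 1311 and 1312: embedded objects and frames add no instruction.
    if url in history.embedded_urls or url in history.frame_urls:
        return None
    # Step 1313: request matches a known hyperlink (context-full).
    if form_data is None and url in history.hyperlink_urls:
        return ("FollowHyperlink", history.hyperlink_urls[url])
    # Step 1314: request matches a known form (context-full).
    if form_data is not None and url in history.form_actions:
        return ("SubmitForm", history.form_actions[url], form_data)
    # Step 1315: unmatched request with form data (context-less).
    if form_data is not None:
        return ("SendForm", url, form_data)
    # Step 1316: unmatched request without form data (context-less).
    return ("LoadPage", url)
```

The last two branches are where a URL produced by client side code ends up, together with any state information it contains.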
In a conventional recorder, each document in the session history is inspected to find forms that exactly match the form being submitted in order to find a form that can be used for a context-full SubmitForm replay instruction. Forms match exactly if the action URL (i.e., the URL that defines where to send the data in the submitted form when the submit button is clicked or a similar action is performed) is identical, and the form being submitted contains all form fields of the form from the document in the session history.
However, the form being submitted may contain additional form fields not present in the form in the document in the session history. For instance, web browsers implicitly add the form fields “x” and “y” which contain the coordinates of the mouse click. Additionally, the form field values in the form being submitted may be different from the form field values in the form in the document in the session history because the user may have edited form field values, but the forms are considered identical since the form being submitted contains all form fields of the form from the document in the session history.
Only if an identical form is found will the conventional recorder be able to record a context-full replay instruction such as “SubmitForm” with a reference to the form from the session history. If no such form is found in the session history, a conventional recorder records a context-less replay instruction such as “SendForm”, which also requires recording in the script a complete specification of the form without using any dynamic information.
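The exact-match rule of the conventional recorder can be sketched as follows. The dictionary layout of a form and the function names forms_match and choose_instruction are hypothetical. Treating browser-added fields such as “x” and “y” as edited values is an assumption of this sketch, not something the description above specifies.

```python
def forms_match(history_form, submitted_form):
    """Exact match: same action URL, and every field of the form from
    the session history is present in the submitted form."""
    return (
        history_form["action"] == submitted_form["action"]
        and set(history_form["fields"]) <= set(submitted_form["fields"])
    )


def choose_instruction(history_forms, submitted_form):
    """Pick a context-full SubmitForm if a match exists, else SendForm."""
    for reference, history_form in history_forms.items():
        if forms_match(history_form, submitted_form):
            # Record only values that differ from the parsed initial values.
            edited = {
                name: value
                for name, value in submitted_form["fields"].items()
                if history_form["fields"].get(name) != value
            }
            return ("SubmitForm", reference, edited)
    # No match: the complete form, including any state, is hard-coded.
    return ("SendForm", submitted_form["action"], dict(submitted_form["fields"]))
```

If client side code changes the action URL or removes a field before submission, the match fails and the hidden token is written into the script as part of a SendForm instruction.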
Scripts can be customized by the tester after recording the script, manually or by using a software tool. However, this method of context management is complex, error-prone, and time-consuming and wastes quality assurance (QA) resources.