Software, such as programs or applications, must be tested after each substantial revision to determine whether the changes in the new version have detrimentally affected the operation of the software through unanticipated conflicts or errors. Software testers use a number of testing tools to evaluate the performance of new versions and to identify the source of any problems they find. Examples of commonly used testing tools include a benchmarking tool (a tool that can determine the time at which, or the duration over which, a specific action, call, or other identifiable point in the execution of a software application occurs), a profiling tool (a tool that identifies the sequence of calls or application programming interfaces (APIs) being used, what functions are being called, and when they are used during execution of a software application), a working set tool (a tool that tracks various measurements related to memory, disk, and CPU usage during execution of a software application), a communication tracking tool (such as a remote procedure call, or RPC, “Bytes over the Wire” tool, which counts the number of bytes transmitted over the network), a file monitoring tool (a tool that tracks the creation and use of files during the execution of a software application), a registry monitor (a tool that records what registry components were accessed), and a network latency tool (a tool that measures the time it takes for communication between a sender and a receiver over the network). In addition to these commonly used tools, a software tester may also use other tools that are specific to the software being tested.
Testing software is a tedious process that must be repeated after each revision. Oftentimes, performance testing starts with a benchmarking test. If the results of the benchmarking test indicate that the performance of the software is not as anticipated, then additional software tests are typically performed, this time with one or more other testing tools, until the source of the problem is identified so that the problem can be corrected.
Each test of the software requires the tester to develop a test scenario in which the tester identifies the testing tool or tools to be used, what data each tool will track, and what operational scenario the software should perform. For example, take the situation in which a software tester finds that a new version of a word processor application fails a benchmarking test, indicating that the combined operation of saving a document and closing the application is too slow. Next, the tester will have to develop a second test in which the time the “save” command is received, the time the document is saved, and the time the application closes are recorded. After these times are recorded, a third test may be developed and run to identify what calls are being made and the time duration of each call. This test will identify the call that is the source of the delay. After the delaying call is identified, additional tests focused on that call must be developed in order to ultimately diagnose the source of the delay.
To be used, testing tools typically require the tester to write a test script that identifies the software to be tested, the scenario to be tested, and what tests to perform. More importantly, the test script must also configure the testing computer to support the particular testing tool required, register and execute the testing tool and, if necessary, identify the exact measurements desired. In order to do this, a tester must have knowledge of the specific configuration requirements for each testing tool and the testing computer. The tester must also understand the specific script commands necessary for each testing tool.
The test script is then executed, and the test data is saved or printed to the screen as directed in the script. In the above example, the test script for the second test would have identified the application and included the “Save and Exit” command. The script would also have included a command to initialize the benchmarking testing tool and commands to record the time at the save call, at the receipt of the save confirmation, and at the application close.
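As an illustration only, the second test described above might be sketched as the following Python script. The class FakeWordProcessor, the functions run_save_and_exit_test and durations, and the timestamp names are hypothetical stand-ins chosen for this sketch; they do not correspond to any particular testing tool or application.

```python
import time


class FakeWordProcessor:
    """Hypothetical stand-in for the application under test."""

    def save(self):
        time.sleep(0.02)  # pretend that writing the document takes time

    def close(self):
        time.sleep(0.01)  # pretend that shutting down takes time


def run_save_and_exit_test(app, clock=time.perf_counter):
    """Record a timestamp at each point of interest in the scenario."""
    timestamps = {}
    timestamps["save_command_received"] = clock()
    app.save()
    timestamps["document_saved"] = clock()
    app.close()
    timestamps["application_closed"] = clock()
    return timestamps


def durations(timestamps):
    """Derive the two durations the tester cares about."""
    return {
        "save": timestamps["document_saved"]
        - timestamps["save_command_received"],
        "close": timestamps["application_closed"]
        - timestamps["document_saved"],
    }


if __name__ == "__main__":
    recorded = run_save_and_exit_test(FakeWordProcessor())
    for name, seconds in durations(recorded).items():
        print(f"{name}: {seconds:.4f} s")
```

Even in this small sketch, the script author has to know which points to record and how to reach them, which is the kind of tool-specific knowledge described above.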
In addition to the commands written in a test script (e.g., initialize a testing tool, run a software application and execute a specified series of actions, record the time of an action, or identify the active objects during a specific period), a test will include “markers” that identify points during the execution of the software application. Markers are typically used as part of a command that directs a testing tool to take a measurement, or begin or end taking measurements. There are two types of markers: code markers and script markers.
To assist testers, some software applications are written with testing in mind and include embedded “code markers” that a tester can use when analyzing performance. For example, markers for common benchmarking points in a software application, such as “receipt of document save command”, “receipt of print command”, “receipt of close command”, and “receipt of document open command”, as well as internal status code markers such as “document saved”, “print complete”, and “document closed”, may be provided in the software application to facilitate testing. Code markers may be used in a test script to direct the measurements taken by a testing tool, but only if the script and testing tool are designed to take advantage of them. Code markers are very helpful because they provide exact feedback from known points in the software application itself. However, code markers are typically used sparingly because they remain in the software after distribution and increase the amount of code that must be stored.
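A minimal sketch of the code marker idea, written in Python, is given below. The functions code_marker and register_marker_listener, the class MarkerTimer, and the function save_document are all hypothetical names introduced for illustration; an actual application would expose its markers through whatever mechanism its developers chose.

```python
import time

# Callbacks registered by testing tools (hypothetical mechanism).
_listeners = []


def register_marker_listener(callback):
    """Let a testing tool subscribe to code markers."""
    _listeners.append(callback)


def code_marker(name):
    """Called from inside the application at a known point."""
    now = time.perf_counter()
    for callback in _listeners:
        callback(name, now)


def save_document(text, storage):
    """Application code with two embedded code markers."""
    code_marker("receipt of document save command")
    storage.append(text)
    code_marker("document saved")


class MarkerTimer:
    """Testing tool that measures the time between two code markers."""

    def __init__(self, start_marker, end_marker):
        self.start_marker = start_marker
        self.end_marker = end_marker
        self._started_at = None
        self.elapsed = None

    def __call__(self, name, timestamp):
        if name == self.start_marker:
            self._started_at = timestamp
        elif name == self.end_marker and self._started_at is not None:
            self.elapsed = timestamp - self._started_at


if __name__ == "__main__":
    timer = MarkerTimer("receipt of document save command", "document saved")
    register_marker_listener(timer)
    save_document("hello", [])
    print(f"save took {timer.elapsed:.6f} s")
```

Because the timestamps are taken inside the application at the marked points, the measurement does not depend on how quickly an external tool notices the event. The marker calls, however, stay in the shipped code.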
Script markers are markers that can be placed in the script and that identify points during the execution of an application without using internally embedded markers. Script markers are not as precise as code markers as there may be a delay between an actual occurrence of an action and the detection of the action by the testing tool. However, script markers are very useful in situations where there is no code marker or in situations where a scenario is being tested involving the interaction between two software applications which have the same code markers.
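The loss of precision can be illustrated with a simple model in which a script detects an action by polling at a fixed interval. The following Python sketch is an assumption made for illustration (real testing tools may detect actions in other ways), and the function names and the millisecond values are hypothetical.

```python
import math


def first_detection_time(event_time_ms, poll_start_ms, poll_interval_ms):
    """Return the time of the first poll at or after the event."""
    if event_time_ms <= poll_start_ms:
        return poll_start_ms
    polls_needed = math.ceil(
        (event_time_ms - poll_start_ms) / poll_interval_ms
    )
    return poll_start_ms + polls_needed * poll_interval_ms


def detection_lag(event_time_ms, poll_start_ms, poll_interval_ms):
    """Delay between the actual event and its detection by the script."""
    detected = first_detection_time(
        event_time_ms, poll_start_ms, poll_interval_ms
    )
    return detected - event_time_ms


if __name__ == "__main__":
    # Document actually saved at 230 ms; script polls every 50 ms from 0 ms.
    print(detection_lag(230, 0, 50))
```

Under this model the reported time can trail the actual event by up to one polling interval, whereas a code marker would report the event at the moment it occurs.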
The drawbacks of the current software testing methods are many. First, testing is very time consuming because current methods are iterative and, oftentimes, each iteration involves the development and execution of a separate script. In addition, a separate test script must be written by the tester for each metric (such as benchmark or working set) and executed separately, and each time a test script is executed the software being tested is executed again. Another drawback is that the test results must be correlated and evaluated by hand to determine where the problem is. Another drawback is that if the testing tool is not written to utilize code markers, then only script markers may be used. Yet another drawback is that if the problem being diagnosed is intermittent, then one or more of the separate executions may not exhibit the problem. A similar problem occurs if the underlying execution framework changes behavior (e.g., is slower in executing the application) for external reasons between different tests, which can lead to a misdiagnosis of the problem.
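To make the manual correlation step concrete, the following Python sketch combines the output of two separate tests: a slow interval reported by a benchmarking test and a call log reported by a profiling test. The record layout, the function names, and the sample values are hypothetical, and the sketch assumes that both tests report times in milliseconds relative to the start of the scenario, which separate executions do not guarantee.

```python
def calls_within(interval, call_log):
    """Return the profiled calls that fall entirely inside the interval."""
    start, end = interval
    return [
        call for call in call_log
        if call["start"] >= start and call["end"] <= end
    ]


def slowest_call(interval, call_log):
    """Return the longest call inside the interval, or None if none."""
    inside = calls_within(interval, call_log)
    return max(
        inside,
        key=lambda call: call["end"] - call["start"],
        default=None,
    )


if __name__ == "__main__":
    # Slow "save and close" interval from a benchmarking run (hypothetical).
    slow_interval = (100, 900)
    # Call log from a separate profiling run (hypothetical).
    profile = [
        {"name": "open_file", "start": 0, "end": 50},
        {"name": "write_document", "start": 120, "end": 300},
        {"name": "flush_cache", "start": 300, "end": 850},
        {"name": "log_exit", "start": 860, "end": 1000},
    ]
    print(slowest_call(slow_interval, profile)["name"])
```

The correlation is only as good as the assumption that the two executions behaved alike. If the problem is intermittent, or the environment changed between runs, the call singled out this way may not be the true source of the delay.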