The present invention relates to data processing systems, and more particularly to systems and methods for developing and debugging software programs implemented by such systems. During the development of computer software programs, it is typically necessary to continually evaluate and test the program to ensure that it will operate properly under a variety of operational conditions. In conducting such testing (commonly referred to as “debugging”), it is often desirable to obtain performance information generated during the actual execution of the program. This information may include timing information relating to execution time of various parts of the program being tested as well as information relating to the program's ability to operate correctly in response to a wide variety of conditions and errors. Preferably, performance information is obtained by monitoring program execution under situations that simulate, as nearly as practical, the actual operating environment in which the software is intended to run. The data gathered during this monitoring (commonly referred to as “tracing”) is then analyzed to locate the corresponding places in the code that may be causing or contributing to any identified problems.
In one particular example of software program debugging, it is foreseen that, in operation, a software program should appropriately respond to erroneous inputs or other types of fault sequences such that the error condition is overcome in a timely manner. That is, the program is developed to include various fault paths which are followed in response to various error conditions. In debugging such functionality within a test program, it is desirable to view the operation or behavior of the fault paths when subjected to the error conditions. It can then be determined, through analysis, whether the test program is reacting properly and in a timely manner.
Unfortunately, several problems exist in performing the above type of fault path analysis and debugging operations. First, many software programs must be responsive to a wide variety of error conditions, each requiring a different response or fault path on the part of the program. Second, the error conditions being tested are often very rare in their natural occurrence. Nonetheless, the software must be designed to overcome the eventuality that the error conditions will occur. In these circumstances, it is often difficult to reproduce such error conditions in a manner which affords testing of the program yet does not itself impact the operation of the program.
Conventionally, error conditions or sequences are tested by inducing the condition (through external means or otherwise) during operation of the software. For example, in a peer-to-peer system between two networked devices, erroneous signals or messages may be sent to a device under test. However, actually generating each of the conditions to be tested is extremely time intensive, if possible at all. Alternatively, the condition may be “tested” during program development by examining the software code itself or test running portions of code to determine end-user operation.
Unfortunately, in complex systems in which various software components are designed to react to each other in a timely manner, code testing alone is unlikely to provide an accurate picture of how the complete system will respond in operation.
In particular, conventional techniques commonly used to debug reactive systems suffer from several deficiencies in that such techniques are generally 1) static, and not dynamic, 2) hard-coded, and not variable, and 3) not probabilistic. Typically a developer debugging a reactive system will make a change in the code running a system at one end of a communications link, which change could force an invalid or out-of-context message to be sent at a certain point in a scenario. This approach is useful for observing the behavior of the system receiving the message, but it is only useful in observing single-shot behavior. For example, in one scenario, conventional debugging techniques enable developers to see what the system does when it receives a bad message one time. However, such static techniques fail to enable the developer to see what will happen in multiple executions of the event path. In addition, with conventional methods, developers cannot see what happens when error conditions appear, go away, and then re-appear after some duration of correct behavior. In this circumstance, it is unclear whether the error handling process is robust enough to handle errors which happen infrequently, interspersed with longer periods of correct operation, and continue to do the job for which it was designed.
Accordingly, there is a need in the art of software development techniques for a system and method for analyzing fault path behavior of computer software programs.
Further, there is also a need for a method and system for reproducibly injecting fault sequences into a computer software program to determine the responsive fault path behavior.
The present invention overcomes the problems noted above, and provides additional advantages, by providing a system and method for determining fault path behavior in a computer software system. An error or event, the occurrence of which is to be tested, is assigned a probability value and an array of elements populated by pseudo-random numbers. Upon each operation of the system under test, the current array value is compared against the probability value. If the current array value is greater than or equal to the probability value, the error or event is simulated within the software. Otherwise, the event is not simulated and the software is left to operate conventionally. The array index is incremented upon each operation of the system under test.
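By way of illustration only, the comparison of a pseudo-random array value against a probability value may be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the names FaultInjector, should_inject, and handle_message are hypothetical, the array values are assumed to be uniformly distributed in [0, 1), the generator is assumed to be seeded so that a fault sequence can be reproduced, and the array position is assumed to wrap around when the end of the array is reached.

```python
import random


class FaultInjector:
    """Decides, on each operation, whether one error or event is simulated.

    Hypothetical helper; all names and defaults here are assumptions.
    """

    def __init__(self, probability_value, size=1000, seed=0):
        # Assumption: a seeded generator, so the same sequence can be replayed.
        rng = random.Random(seed)
        self.probability_value = probability_value
        # Array of elements populated by pseudo-random numbers in [0, 1).
        self.values = [rng.random() for _ in range(size)]
        self.index = 0

    def should_inject(self):
        """Compare the current array value against the probability value."""
        current = self.values[self.index]
        # Advance to the next element; wrapping around is an assumption.
        self.index = (self.index + 1) % len(self.values)
        # The event is simulated when the array value is >= the probability value.
        return current >= self.probability_value


def handle_message(message, injector):
    """Hypothetical operation of the system under test."""
    if injector.should_inject():
        # Simulate the error so that the fault path is exercised.
        return "fault-path"
    # Otherwise the software is left to operate conventionally.
    return "normal-path"
```

Under the uniform-value assumption, and with the comparison as stated above, a probability value of 0.9 would cause the event to be simulated on roughly one operation in ten, interspersed with periods of correct operation. Two injectors built with the same seed and size yield the same sequence of decisions, which is one way the injected fault sequence could be made reproducible.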