Definitions
“compliance assessment”. (Defined in the text below.)
“compliance assessment protocol”. (Defined in the text below.)
“compliance verification”. Determination of the level of correspondence between the requirements expressed in a system specification and the performance of a DUT.
“deterministically specified system”. (Defined in the text below.)
“directive”. The communication of a request or command, and any other communication made for the purpose of causing the occurrence of an intended result. Those skilled in the art are familiar with a variety of ways of implementing directives, such as function calls.
“DUT”, or “device under test”. An implementation of a system that is to undergo verification.
“DUT computation”. Computation of DUT behavior (e.g. simulation of a model of a DUT) during dynamic system verification.
“dynamic verification run”. A single session of testing, that is, applying a compliance assessment protocol to a DUT.
“dynamic verification suite”. The set of dynamic verification runs used in compliance verification.
“interface”. Specified behavior made accessible by a system specification. This can include specialized behavior made available solely during verification. The totality of the behavior exhibited by a system is its interface, but, for convenience in explanation, it is sometimes useful to speak of different subsets or subparts of a system's interface as if they were separate.
“NDA”, or “non-deterministic automaton”. (Defined in the text below.)
“non-deterministically specified system”. (Defined in the text below.)
“sequence”. A series of verification operations and their associated timing information. Timing information may be in terms of suitably quantized periodic intervals or, equivalently, at arbitrary intervals dependent on the verification operations in question.
“system”. A device or set of devices intended to accomplish a particular specified set of functions. A system could be implemented as software, hardware, or some combination of software and hardware. It is possible that the system exists in and of itself, as a model in some modeling environment, or as a combination of existence for some portion and model for others. The system has a defined interface through which it interacts with the outside world and the outside world interacts with it.
“test computation”. Computation performed during dynamic system verification other than DUT computation.
“traverse”. (Defined in the text below.)
“user”. Person or persons involved in a compliance verification of a system.
“verification environment”. The situation under which a dynamic verification run is conducted (e.g., temperature, humidity, clean room, noise, NC-Verilog™ hardware description language simulator, ModelSim™ hardware description language simulator). Together with the interface of the system, this establishes the set of verification operations that are possible to apply to the DUT as well as to the non-DUT components of the verification environment.
“verification operation”. A verification action at an instant of time. The set of possible verification operations, which includes read and write, depends on a system's interface and a verification environment.
“verification protocol”. (Defined in the text below.)
Overview
The creation of a system typically begins with the creation of a system specification (which may or may not be written down, in whole or in part) that defines the requirements for the system's functioning. These requirements may concern not only the desired function of the system but also other attributes such as power consumption, speed of operation, and so forth.
Verifying that a complex system operates in compliance with the requirements given in its system specification is difficult for many reasons. In particular,
    1. The size of input that may be supplied to the system at any one time can be substantial. Even if the system has a relatively small number of individual input ports, the number of possible combinations of input values can still be very large. For example, having only 30 binary input lines still represents over a billion possible input combinations, and it is not uncommon for a system to have several times this number of inputs.
    2. The system may keep an internal state. As a result, the functioning of the system at a particular time can be dependent not only on the inputs being supplied to the system at that time but also on all input that has been supplied to the system up to that time.
    3. The combination of the two previous points results in an almost incomprehensibly large number of possibly unique situations arising in a given system. For example, a series of only 12 of the 30-bit inputs mentioned previously represents 2^(12·30), or about 10^108, potentially unique situations. Even if every particle in the universe were a verification system able to perform a verification test every nanosecond, there has not been enough time from the beginning of the universe to have tested all of the possible unique situations in this simple example.
    4. Given the large number of unique situations arising from even limited series of inputs, exhaustive testing is not possible in the limited time and using the limited resources available today. Only a tiny fraction of the number of unique situations can actually be used for verification purposes. The quality of the compliance verification process is then highly dependent on how “good” the tests are that make up the sample from the total space of possible tests.
(A “good” test is defined as one likely to detect a potential discrepancy between the DUT and its specification. A “good” set of tests is one that includes tests that do not significantly overlap in terms of which discrepancies they detect; that is, that they exhibit little-to-no redundancy.)
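The arithmetic behind the 30-line, 12-step example can be checked with a short sketch. The particle count and the age of the universe used for the comparison are commonly quoted round figures that we assume here for illustration; they do not appear in the text above.

```python
import math

INPUT_BITS = 30       # binary input lines (from the text)
SERIES_LENGTH = 12    # successive inputs in the series (from the text)

# Assumed round figures for the comparison; neither value appears in the text.
PARTICLES_IN_UNIVERSE = 10 ** 80
SECONDS_PER_YEAR = 31_557_600                                      # 365.25 days
AGE_OF_UNIVERSE_NS = 138 * 10 ** 8 * SECONDS_PER_YEAR * 10 ** 9    # 13.8e9 years

combinations_per_step = 2 ** INPUT_BITS
situations = 2 ** (INPUT_BITS * SERIES_LENGTH)

# One test per particle per nanosecond since the beginning of the universe.
tests_possible = PARTICLES_IN_UNIVERSE * AGE_OF_UNIVERSE_NS

print(combinations_per_step)               # 1073741824
print(round(math.log10(situations), 1))    # 108.4
print(situations > tests_possible)         # True
```

Under these assumed figures the number of situations exceeds the number of tests that could have been performed by a factor of roughly fifty, so the conclusion holds but depends on the estimates chosen.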
When searching for “good” tests or classes of tests, it is natural to attempt to control the complexity of the task by subdividing the total interface to the system into more manageable pieces and constructing tests for those pieces independently. While this kind of “divide and conquer” approach is useful in many other areas, it is not appropriate in the compliance verification of complex systems. In particular, the following difficulties are encountered.
    1. Complex systems in general comprise a number of subparts which may interact internally, either directly (through, e.g., active collaboration on the accomplishment of some task) or indirectly (through, e.g., the successive use of the same shared resources). These subparts may exhibit behavior through the system's interface and these behaviors may be in active use simultaneously.
    2. Another layer of complexity is therefore added by the interaction of the activity of one of these subparts with the activity of the others. In particular, not only is the behavior of a subpart dependent on the entire history of its input stream, it is also dependent on the relative timing of its activity with that of other subparts of the system, which are themselves dependent on the entire histories of their own input streams.
    3. As a result, it is not possible in general to separate the verification of one part of a system's interface from that of the other parts, even when the parts do not, from a user's point of view, have any relation to each other. As a particular example, presenting the same input to one part of the system's interface at different times can result in different responses, depending upon what else just happened to be occurring elsewhere in the system at that moment.
    4. Defects may therefore be obscured until not only the appropriate stimulus on the pertinent part of the interface is applied, but also until that stimulus is coordinated with the complete stimulus history applied to other parts of the interface as well. More intricate and subtle tests have to be created to exercise these situations.
To users, the behavior of systems can appear to be either deterministic or non-deterministic. Behavior appears to be deterministic to a user of a system if the specification provided to said user has sufficient information to enable the computation of the precise response pattern of the system for any sequence of stimulus that is included in the system's specification. We call a specification that makes a system appear to be deterministic a “deterministic specification”. Behavior appears to be non-deterministic to a user of a system if the specification provided to said user does not have sufficient information to enable the computation of the precise response pattern of the system for a sequence of stimulus that is included in the system's specification. We call a specification that makes a system appear to be non-deterministic a “non-deterministic specification”.
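The distinction can be illustrated with a minimal sketch in which a specification is modeled as a function from a stimulus to the set of responses it permits. The arbiter, its requester names, and all function names below are hypothetical, and the sketch simplifies the definition above by considering a single stimulus rather than a full stimulus sequence.

```python
from typing import Callable, FrozenSet, Iterable

Requests = FrozenSet[str]
Spec = Callable[[Requests], FrozenSet[str]]


def ds_arbiter_spec(requests: Requests) -> FrozenSet[str]:
    """Deterministic: the alphabetically first requester is always granted."""
    return frozenset({min(requests)}) if requests else frozenset({"none"})


def nds_arbiter_spec(requests: Requests) -> FrozenSet[str]:
    """Non-deterministic: any one of the requesters may be granted."""
    return requests if requests else frozenset({"none"})


def appears_deterministic(spec: Spec, stimuli: Iterable[Requests]) -> bool:
    """True if the spec permits exactly one response for every stimulus."""
    return all(len(spec(s)) == 1 for s in stimuli)


def complies(spec: Spec, requests: Requests, observed_grant: str) -> bool:
    """A response complies if it is one of the responses the spec permits."""
    return observed_grant in spec(requests)


STIMULI = [frozenset(), frozenset({"a"}), frozenset({"a", "b"})]
print(appears_deterministic(ds_arbiter_spec, STIMULI))           # True
print(appears_deterministic(nds_arbiter_spec, STIMULI))          # False
print(complies(nds_arbiter_spec, frozenset({"a", "b"}), "b"))    # True
print(complies(ds_arbiter_spec, frozenset({"a", "b"}), "b"))     # False
```

The same observed response ("b" granted) complies with the non-deterministic specification but not with the deterministic one, which is why checking an NDS system requires comparing against a set of permitted behaviors rather than a single predicted response.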
While at first glance it may appear that a deterministic specification is more desirable, a system may have a non-deterministic specification for a variety of reasons.
    1. The system may be implemented in such a way that it is inherently non-deterministic (e.g., dependent on quantum decay events).
    2. The system may have a deterministic but very complex implementation, which would be difficult for users to understand or work with. A non-deterministic specification can be considerably more concise, as well as easier to understand and work with.
    3. The system may have a deterministic but proprietary implementation, which the system provider does not want to reveal.
    4. The system may be a deterministic implementation of a specification that was deliberately written as a non-deterministic specification to allow a wide range of implementations. Industry-standard interface specifications are often written as non-deterministic specifications for this reason.
We will refer to a system that has a non-deterministic specification as a “non-deterministically specified system”, or an “NDS system”. We will refer to a system that has a deterministic specification as a “deterministically specified system”, or a “DS system”.
A large portion of modern systems are NDS systems; for example, microprocessors, multi-threaded software systems (e.g., Windows, Linux), automotive control systems, avionic control systems, and so forth. An NDS system may expose subsystems which are DS systems for, e.g., specialized applications.
The manual selection of “good” tests or “good” classes of tests from the space of all possible tests is difficult for many reasons. In particular,
    1. As has been described, the space of possible tests is so immense that only a tiny fraction of tests can ever be selected. As is well-known by statisticians, selecting an appropriate sample under such conditions is quite difficult. It is also difficult to define what “statistically appropriate” means when applied to compliance verification. Consequently, much of the prior art mathematical analysis that has been created for other domains may not be applicable.
    2. Given the large number of dimensions along which the space can vary, it is difficult for humans to conceptualize the entire extent of possible tests. Without such conceptualization, however, it is not possible to create an efficient and spanning test strategy.
    3. In response, ad hoc and intuitive approaches have been used. These methods can be somewhat successful in achieving limited levels of compliance assessment, but in general they do not provide adequate coverage across the entire space of possibilities. The tests that are produced under such strategies tend to be concentrated in those areas that “seem to need” testing, according to preconceived notions. Other areas might remain entirely uncovered.
    4. The conceptualization difficulty extends to those situations that require extended setup. The intricate nature and length of the sequence necessary to establish the conditions for a particular defect to manifest itself in an observable fashion can be beyond the ability of a user to visualize even though, because of the high speed of the components, such conditions may occur within the normal functioning of the system at frequent intervals.
    5. The conceptualization difficulty further extends to those situations that require interaction complexity. Systems can achieve performance advantages by reusing internal resources and performing functions in parallel. Such techniques result in complex internal interactions, not readily apparent to the user, even when all details of the implementation are known. This internal interaction complexity can also be beyond the ability of the user to visualize, especially when taken in conjunction with the extended-setup difficulty previously described.
Once a problem has been detected, it is necessary to track back to the original root cause. This root cause may have occurred much earlier during the execution of a test than the point at which an anomaly was first observed, and the anomaly may be observed in a part of the system far removed from where the defect actually is located. One way to do this analysis is to re-run the same sequence that exposed the defect, but with additional checking enabled to observe aspects of the DUT that were previously ignored. A difficulty with this approach is that numerous re-runs may be required. Another difficulty is that the set-up required to get the DUT into the state in which the root cause may occur could be quite time-consuming to execute. These two factors can compound each other.
Another approach entirely is for the user to contrive a hypothesis as to the cause of the observed defect and then construct a “small” test program that exercises just those circumstances. The construction of such a test program can be quite complex because, among other reasons, it is not always easy to identify the least amount of set-up required to get the DUT into the state that detection of the root cause requires. It is of great value to have a testing method that supports the rapid development of a sequence or set of sequences that exhibit particular properties.
Verification of a system's compliance with its specification may be performed at various stages in the system's development, either by using a model of the system at some abstraction level or by using a fabricated example of the system. Performing compliance verification by using a model before actual system fabrication can be very advantageous because of the great savings in time, money, and resources that may be gained.
For purposes of explanation only, the following discussion will focus on compliance verification by using system models as opposed to actual system implementations.
Within the context of compliance verification using a system model, the verification environment may be more precisely defined as encompassing among other items a representation of the system model (also known as the target or DUT, device under test) and a mechanism for processing the system model (typically a simulator or a model interface, which in some cases may be built into the system model itself). Within this environment, the number of possible individual verification operations (appropriate or not) that may be performed on the DUT at any one time can be very large. The number of different sequences of verification operations that may be performed on the DUT increases exponentially as a function of the complexity of the system and the lengths of the sequences. Whenever we refer to reading from or writing to the DUT, we also mean reading from or writing to any part of the verification environment, including the mechanism for processing the system model as well as any other objects.
Out of the universe of all possible sequences, the user defines the subset for which the specification defines a behavior that will result when the sequence is applied to the DUT. We call such sequences “meaningful”. The meaningful sequences can include both those that are anticipated during normal operation of the DUT and those that indicate errors. It is useful to include both types of sequences because the user may desire to ascertain not only that the DUT operates according to its specification when presented with normal inputs but also when presented with error inputs.
The definition of this subset is itself typically extremely difficult due to, among other factors, the non-deterministically specified behavior of the DUT, which constrains what is and is not meaningful, and the need to ensure that only meaningful sequences are included.
This subset of sequences is known as the verification protocol.
Typically, testing the DUT merely by applying the verification protocol to it is not sufficient to achieve effective and efficient compliance verification. Usually additional compliance assessment computation is done during testing, typically to check that DUT behavior meets requirements. In general, compliance assessment can be described without loss of generality as a sequence of steps each of which occurs between the relevant steps of the verification protocol. Each compliance assessment step takes as its input the current state of the compliance assessment and the current observable state of the DUT, and computes a new state of the compliance assessment, or reports information to the user, or some combination of both. The combination of the verification protocol and compliance assessment will be referred to hereinafter as the compliance assessment protocol. The process of testing a DUT by applying a compliance assessment protocol to said DUT is called dynamic verification.
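The interleaving of verification protocol steps and compliance assessment steps can be sketched as follows. The counter DUT model, its two operations, and every name in the code are hypothetical. As a simplification, the assessment step here also receives the verification operation that was just applied, in addition to the assessment state and the observable DUT state described above.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


class CounterDut:
    """Hypothetical DUT model: a counter with 'inc' and 'reset' operations."""

    def __init__(self, buggy_reset: bool = False) -> None:
        self.count = 0
        self.buggy_reset = buggy_reset

    def apply(self, op: str) -> None:
        if op == "inc":
            self.count += 1
        elif op == "reset":
            # The injected defect resets the counter to 1 instead of 0.
            self.count = 1 if self.buggy_reset else 0

    def observe(self) -> int:
        return self.count


@dataclass(frozen=True)
class AssessmentState:
    expected: int = 0
    mismatches: int = 0


def assessment_step(
    state: AssessmentState, op: str, observed: int
) -> Tuple[AssessmentState, List[str]]:
    """Compute the new assessment state and any reports for the user."""
    expected = 0 if op == "reset" else state.expected + 1  # only 'inc'/'reset' exist
    if observed == expected:
        return AssessmentState(expected, state.mismatches), []
    report = f"after {op!r}: expected {expected}, observed {observed}"
    return AssessmentState(expected, state.mismatches + 1), [report]


def dynamic_verification_run(
    dut: CounterDut, sequence: Sequence[str]
) -> Tuple[AssessmentState, List[str]]:
    """Apply each protocol step, then run one assessment step after it."""
    state, reports = AssessmentState(), []
    for op in sequence:
        dut.apply(op)
        state, new_reports = assessment_step(state, op, dut.observe())
        reports.extend(new_reports)
    return state, reports


SEQUENCE = ["inc", "inc", "reset", "inc"]
good_state, good_reports = dynamic_verification_run(CounterDut(), SEQUENCE)
bad_state, bad_reports = dynamic_verification_run(CounterDut(buggy_reset=True), SEQUENCE)
print(good_state.mismatches, bad_state.mismatches)   # 0 2
```

In this sketch a single defect in the reset operation produces two reports, because the incorrect count persists into the following step. This is a small instance of an anomaly being observed after the point at which its root cause occurred.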
To meet specific compliance verification goals the user identifies which sequence(s) from the compliance assessment protocol will be made use of in any one dynamic verification run. Such goals may include the exercise of certain portion(s) of the DUT, the exercise of certain subset(s) of the system specification, completion of the dynamic verification suite within a time limit, the ability to execute the dynamic verification suite using a particular set of resources, and so forth. The user may decide to perform a given dynamic verification run repeatedly to investigate a problem. As some sequences may be infinite in length, the user may decide to make use of only a portion of a sequence. The user may also decide to make use of a sequence repeatedly.
Along with sequence selection, the user also determines when a sufficient amount of verification has been conducted to declare the DUT as in compliance with its specification, and the conditions under which the DUT will be declared as not in compliance with its specification.
A representation of the compliance assessment protocol and a mechanism for applying the compliance assessment protocol to the system model and interpreting the results are included as part of the verification environment mentioned above.
Aside from the great difficulties just described, the user often has the additional difficulty of having to create the compliance assessment protocol while the specification of the system is not in its complete, finished form. Changes to the specification may be made at any time during system verification, and such changes must be reflected by appropriate changes in the compliance assessment protocol.
Another significant issue arises from the fact that many “new” system designs that need to be verified are not totally new, but are instead improved versions of existing system designs that have already been verified. Because creation of a compliance assessment protocol for a complex system is quite difficult, expensive, and time-consuming, it is important from both a cost perspective and a time-to-market perspective that a compliance assessment protocol for a system be created in such a way that as much of it as possible can readily be reused on future or similar systems.
In the prior art, various methods have been used for expressing the compliance assessment protocol, identifying the sequence(s) or portion(s) of sequence(s) that are used in a particular dynamic verification run, maintaining and updating the specification of the compliance assessment protocol, and providing at least some measure of verification work reuse.
In general the prior art approaches may be divided into those that are purely programmatic in nature and those that use automata. The automata approaches in general include some measure of programmatic techniques to carry out imperative functions. Each of these kinds of approach is considered below.
The most common of the early programmatic approaches to verification is known as “directed testing”. Typically a directed test program implements and tests a single sequence from the compliance assessment protocol. The user manually codes the test program in some imperative programming language(s) supported by the verification environment. This test program generates some stimulus to be applied to the DUT, observes the DUT's response or reaction, and then repeats the process.
The sequence from the compliance assessment protocol implemented by a particular directed test program is chosen through the user's manual selection of the particular pattern of stimulus to generate for application to the DUT. It is the user's responsibility to ensure that the DUT is in the proper state to accept the stimulus (if any), to observe the DUT's response or reaction (if any), and to check that said response or reaction is appropriate for the applied stimulus. All of these actions must be manually coded by the user.
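A directed test of the kind described above might look like the following sketch. The FIFO model, its depth, and the data values are hypothetical and chosen only for illustration; the point is that the state precondition, the stimulus, and the response check for this one sequence are all written out by hand.

```python
from collections import deque


class FifoModel:
    """Hypothetical DUT model: a first-in, first-out queue of fixed depth."""

    def __init__(self, depth: int = 2) -> None:
        self.depth = depth
        self.items = deque()

    def full(self) -> bool:
        return len(self.items) >= self.depth

    def empty(self) -> bool:
        return not self.items

    def write(self, value: int) -> None:
        if self.full():
            raise RuntimeError("write to full FIFO")
        self.items.append(value)

    def read(self) -> int:
        if self.empty():
            raise RuntimeError("read from empty FIFO")
        return self.items.popleft()


def directed_test_fill_then_drain(dut: FifoModel) -> bool:
    """One hand-coded sequence: fill a depth-2 FIFO, then drain it in order."""
    # The user is responsible for ensuring the DUT is in the proper state.
    if not dut.empty():
        return False
    # Manually selected stimulus.
    dut.write(0xA5)
    dut.write(0x5A)
    if not dut.full():
        return False
    # Manually coded response checks: values must come back in write order.
    return dut.read() == 0xA5 and dut.read() == 0x5A and dut.empty()


print(directed_test_fill_then_drain(FifoModel(depth=2)))   # True
```

Covering a different sequence, such as interleaved writes and reads, would require writing a second function of similar length by hand.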
The cost of creating a directed test for each of the large number of sequences that are needed to test a typical modern design is generally prohibitive. Fortunately there are a number of similarities within various subsets of the desired tests, so some program code can be shared between some of the directed tests. It is a natural progression from this observation to the creation of a parameterized or configurable test program that encompasses multiple directed tests. These configurable test programs can decide which of a range of tests to perform, how to vary the internal parameters that affect the tests, and so on. Such a test program accepts as part of its input from the user an indication of which aspects of its range of testing it is to carry out during any particular dynamic verification run, acceptable value sets for various parameters and the like, and so forth.
The combination of a set of directed tests into a configurable test program is entirely manual. The user must conceive of a set of possible tests, select the subset of these tests that will effectively make use of the available verification resources by minimizing redundancy, identify the dimensions along which the selected tests may be factored, perform the factoring, and blend the fragments into the configurable test program. This work requires a higher degree of skill than the creation of individual directed test programs, for the user must not only be able to comprehend what the individual directed test programs would be like if written (so the appropriate fragments for factoring can be identified) and conceive of how the fragments may be called in sequence so the DUT progresses through the appropriate series of states, but also be able to construct a suitable mechanism for doing so.
The configurable test program contains not only the union of all of the functionality of the individual directed test programs it replaces, but also extra code to implement the configurability itself. This extra code adds knowledge of a different sort than that contained in an individual directed test program. Instead of being directed towards the generation of stimulus, the acquisition of response or reaction information, or the checking thereof, the added material relates to the higher-order pattern of exercising the DUT. There has been a mixing of knowledge about what should be verified (i.e., which of the possible tests should be applied) with how (i.e., application of stimulus, checking of response). This mixing results in a significantly more complex test program and correspondingly increases the cost in time and effort to create the test program and to ensure that it is correct.
At the cost of a substantial increase in complexity and programming skill requirements, configurable test programs provided significantly more unique test sequences than could be created using individual directed tests. However, as design complexity increased, more powerful techniques were needed to expand the number of tests. The most popular recent technique is constrained pseudo-random testing (CPRT). In this approach, a number of pseudo-random choices are incorporated into a configurable test program which extend the range of possible sequences that the program can generate. In general some of these sequences are in the compliance assessment protocol and some are not. Therefore a CPRT program typically also contains a number of constraints on the pseudo-random choices which operationally restrict the choices so that the resulting sequences are all in the compliance assessment protocol.
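The role of the constraints in a CPRT program can be sketched as follows. The FIFO-style operations and all names are hypothetical, and we assume for illustration a protocol that excludes writes to a full FIFO and reads from an empty one. The constraints restrict each pseudo-random choice to the operations that keep the sequence inside that protocol.

```python
import random
from typing import List, Optional, Tuple

Operation = Tuple[str, Optional[int]]


def generate_cprt_sequence(length: int, depth: int, seed: int) -> List[Operation]:
    """Generate a pseudo-random operation sequence for a FIFO of the given depth.

    Requires depth >= 1 so that at least one operation is always allowed.
    """
    rng = random.Random(seed)    # seeded, so a run can be reproduced
    occupancy = 0                # generator's own model of how full the FIFO is
    sequence: List[Operation] = []
    for _ in range(length):
        # Constraints: only operations that are legal in the current state.
        allowed = []
        if occupancy < depth:
            allowed.append("write")
        if occupancy > 0:
            allowed.append("read")
        op = rng.choice(allowed)   # pseudo-random choice among legal operations
        if op == "write":
            sequence.append(("write", rng.randrange(256)))
            occupancy += 1
        else:
            sequence.append(("read", None))
            occupancy -= 1
    return sequence


def max_and_min_occupancy(sequence: List[Operation]) -> Tuple[int, int]:
    """Replay a sequence and report the extreme occupancies it reaches."""
    occupancy, highest, lowest = 0, 0, 0
    for op, _ in sequence:
        occupancy += 1 if op == "write" else -1
        highest, lowest = max(highest, occupancy), min(lowest, occupancy)
    return highest, lowest


seq = generate_cprt_sequence(length=50, depth=4, seed=1)
print(max_and_min_occupancy(seq))
```

Because the generator is seeded, repeating a run with the same seed reproduces the same sequence, which is useful when a problem needs to be investigated by re-running.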
The CPRT approach does expand the number of tests that a configurable test program can create, but at a substantial cost in terms of simulation efficiency. To completely cover a given set of sequences typically takes considerably more simulation cycles with a CPRT program. The amount of simulation time wasted increases rapidly as the size of the target sequence set grows, typically at the rate of (N·ln(N) − N), where N is the size of the target sequence set. Even for a relatively small N of twenty-three thousand, on average it takes about ten times as long to cover the sequence set with CPRT as it would if each sequence were generated exactly once. For the substantially larger sequence sets that are typically needed to test complex modern designs, the cost of this can be prohibitive.
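The figure of roughly ten for N of twenty-three thousand can be reproduced from the stated growth rate. The sketch below also compares it with the expected number of draws under one plausible model, in which each of the N sequences is drawn independently and with equal probability. That model is our assumption for illustration; the text does not state the basis of its estimate.

```python
import math


def wasted_estimate(n: int) -> float:
    """Wasted sequence generations according to the rate given in the text."""
    return n * math.log(n) - n


def expected_draws_uniform(n: int) -> float:
    """Expected draws to see all n items under uniform independent sampling.

    This is n times the n-th harmonic number (an assumed model, not from the text).
    """
    return n * sum(1.0 / k for k in range(1, n + 1))


N = 23_000

# Useful work is N generations; total is useful plus wasted.
text_ratio = (wasted_estimate(N) + N) / N        # equals ln(N)
uniform_ratio = expected_draws_uniform(N) / N    # equals the N-th harmonic number

print(round(text_ratio, 1), round(uniform_ratio, 1))
```

Both ratios come out slightly above ten, and both grow without bound as N increases, which is consistent with the observation that the cost becomes prohibitive for larger sequence sets.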
In the prior art, an alternative to the use of a programmatic approach to system verification is the use of an automata-based approach.
Non-deterministic automata can be described in a variety of ways. For example, some prior art teachings describe non-deterministic automata in terms of states and transitions, some prior art teachings describe non-deterministic automata in terms of grammars or sets of strings of terminals, and some prior art teachings describe non-deterministic automata in terms of graphs, i.e., nodes and connections. All such prior art non-deterministic automata descriptions are mathematically equivalent, and can be transformed into each other using techniques well-known to those skilled in the art.
Prior art teachings show how to apply non-deterministic automata to the problem of system verification by using them to define sequences of test program fragments (also known as “action routines”) that can be executed one after another during a dynamic verification run. For example, some prior art teachings use non-deterministic automata described in terms of a combination of states and transitions with references to test program fragments from the states, from the transitions, or from a combination of both. For example, some other prior art teachings use non-deterministic automata described in terms of a combination of grammars or sets of strings of terminals with references to test program fragments from the terminals, from the non-terminals (if any are defined in the grammars), or from a combination of both. For example, some other prior art teachings use non-deterministic automata described in terms of a combination of graphs (i.e., nodes and connections) with references to test program fragments from the nodes, from the connections, or from a combination of both. The prior art implements references to test program fragments in a variety of ways, such as pointers, lookup table entries, etc. All such prior art combination descriptions (including in said combination descriptions the references to test program fragments and the referred-to test program fragments themselves) are mathematically equivalent, and can be transformed into each other using techniques well-known to those skilled in the art. Some prior art teachings refer to the entities described by the combination descriptions as “extended non-deterministic automata”, regarding the test program fragments as “automata extensions”. Other prior art teachings simply refer to the entities described by the combination descriptions as “non-deterministic automata”, regarding the test program fragments to be part of the automata. 
For the sake of clarity and brevity, we will take the latter approach and use the phrase “non-deterministic automata” to refer to the entities described by the combination descriptions. The singular form of “non-deterministic automata” is “non-deterministic automaton”. For brevity, we will sometimes use the abbreviation “NDA” to refer to a non-deterministic automaton. Mathematically, the set of non-deterministic automata without test program fragments is a proper subset of the set of non-deterministic automata with some non-negative number of test program fragments, since the number can be zero.
Since non-deterministic automata can be described in a number of mathematically equivalent ways, for the sake of brevity we will select any one of the description methods whenever we make a point about them; the application of the point to non-deterministic automata described using any of the other methods should be readily appreciated by those skilled in the art. Also, prior art teaches that deterministic automata are a proper subset of the set of non-deterministic automata, and so are included whenever non-deterministic automata are discussed.
The automaton approach to testing was developed in the prior art by those verifying a DS system or a deterministic subset of an NDS system. Primarily, the domain area was single-threaded software applications. Typically in this prior art, the automaton is used to help track the states of the software application under test.
In many prior art automata applications, a graph representing the NDA's grammar (an “NDA graph”) is constructed, often by means of a program that takes as input an extended-BNF-like description and creates an NDA graph automatically. The graph is traversed to generate a terminal sequence. This terminal sequence is used in effect to stitch together in sequential order the test program fragments corresponding to the terminals to compose a specific application of the compliance assessment protocol for a dynamic verification run.
In the prior art, traversing is done in a variety of ways. “Traverse” includes any method of processing an NDA in which a plurality of components of the NDA are processed in a sequence which is defined in the specification of the NDA.
By using the automaton method, the specification of an entire universe of test programs can be constructed using a relatively modest number of test program fragments that carry out particular fundamental operations. The generation of any individual test from the specified universe can be carried out by a generic test generation program that takes a non-deterministic automaton representing a compliance assessment protocol as its input and performs a traversal of the corresponding NDA graph according to some particular traversal strategy. The application of the generated test can be carried out by a generic test application program that executes the corresponding test program fragments in the sequential order determined by the traversal. Typically changes to the specification of the universe of test programs do not require changing the test generation program or the test application program (which may in some cases be combined into a single program).
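The separation between the NDA graph, the generic test generation program, and the generic test application program can be sketched as follows. The graph, the terminal names, the dictionary used as a DUT model, and the random-walk traversal strategy are all hypothetical choices made for illustration; they are not taken from any particular prior art system.

```python
import random
from typing import Callable, Dict, List

# Hypothetical NDA graph: each node is a terminal; edges give allowed successors.
NDA_GRAPH: Dict[str, List[str]] = {
    "start": ["open"],
    "open": ["transfer", "close"],
    "transfer": ["transfer", "close"],
    "close": ["open", "end"],
    "end": [],
}


def do_open(dut: dict) -> None:
    assert not dut["is_open"], "open while already open"
    dut["is_open"] = True


def do_transfer(dut: dict) -> None:
    assert dut["is_open"], "transfer while closed"
    dut["transfers"] += 1


def do_close(dut: dict) -> None:
    assert dut["is_open"], "close while already closed"
    dut["is_open"] = False


# Test program fragments ("action routines") referenced from the terminals.
FRAGMENTS: Dict[str, Callable[[dict], None]] = {
    "open": do_open,
    "transfer": do_transfer,
    "close": do_close,
    "end": lambda dut: None,
}


def traverse(graph: Dict[str, List[str]], rng: random.Random, max_steps: int) -> List[str]:
    """Generic test generation: a random walk from 'start', one possible strategy."""
    node, terminals = "start", []
    while graph[node] and len(terminals) < max_steps:
        node = rng.choice(graph[node])
        terminals.append(node)
    return terminals


def apply_sequence(terminals: List[str], fragments: Dict[str, Callable[[dict], None]], dut: dict) -> None:
    """Generic test application: run the fragments in traversal order."""
    for terminal in terminals:
        fragments[terminal](dut)


terminals = traverse(NDA_GRAPH, random.Random(3), max_steps=20)
dut_model = {"is_open": False, "transfers": 0}
apply_sequence(terminals, FRAGMENTS, dut_model)
print(terminals[0], len(terminals) <= 20)   # open True
```

Adding a new operation in this sketch means adding one node, its edges, and one fragment; neither `traverse` nor `apply_sequence` changes. A different traversal strategy could likewise be substituted without touching the graph or the fragments.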
In general, the prior art automaton approaches have several advantages over the strictly programmatic approach. The advantages are analogous to those obtained in compiler development by moving from a hand-crafted parser (for example, a recursive-descent parser) to one created from a grammar-based language specification by an automatic parser generator (for example, yacc or Bison).
One advantage of the prior art automaton approaches is that they separate the specification of the compliance assessment protocol (expressed by a non-deterministic automaton) from the selection of the particular sequence(s) used in a given dynamic verification run (expressed by how the non-deterministic automaton is processed or traversed). As a result, different persons can work on either portion of the problem, improving the development, communication, review, and management aspects of the process of test program creation. Further, the scope of the compliance assessment protocol can be easily seen without the clutter of the details contained in the test program fragments. Another significant advantage is that the automaton's grammar itself can be related directly to a system's specification without having to factor out the traversal code. This provides the foundation for a more meaningful measurement of the coverage achieved by a particular dynamic verification run.
In contrast, the programmatic approach defines the compliance assessment protocol operationally, that is, by the particular code that makes up the test program. In general, there is no distinction drawn in the test program between (a) the code that generates and applies the stimulus, acquires the response or reaction of the DUT, and checks the result for appropriateness, and (b) the code that makes the actual selection of which stimulus sequence to apply. By mixing the two, it becomes much more difficult if not impossible to independently process the two kinds of knowledge. It is especially difficult for the traversal strategy since in the programmatic approach that strategy is typically implicit in how the test program was coded. The user's intent is unavoidably obscured and the strategy cannot be easily understood or modified automatically to achieve a different objective than was intended when it was originally written, for example, additional coverage objectives. The coverage that can be measured is less useful because it focuses on lines or paths of the test program's code rather than the requirements given in the system's specification. Further, since the sequence selection is manual, all of the difficulties associated with the manual selection of “good” test cases (discussed above) are encountered.
Another advantage of automaton-based approaches is that they result in more concise specifications of the compliance assessment protocol for a given level of coverage of the specification. As a result, less time is typically required for their creation, and they are easier to change, maintain, update, extend and reuse. Because of these factors, they are also likely to have fewer bugs. The total amount (and complexity) of code that the user must manually create in the programmatic approach puts severe limits on the amount of coverage a user can achieve in a given amount of time.
The advantages of automaton-based approaches over programmatic approaches are a function of the degree to which the extent of the verification protocol is expressed in the automaton's grammar instead of in the test program fragments. For a given compliance assessment protocol, the more the extent of the verification protocol is expressed in the grammar, the more the advantages of the automaton-based approach can be realized. In contrast, the more the extent of the verification protocol is expressed in the test program fragments, the closer it comes to the programmatic approach with all of the disadvantages described above.
The prior art automaton approaches have primarily been applied to testing systems, such as single-threaded software programs, in which non-deterministically specified behaviors are either non-existent or highly localized. In such verification applications, most of the extent of the verification protocol can readily be expressed in the automaton's grammar without any need for the ability to dynamically constrain the sequence of terminals generated (equivalent to dynamically constraining the traversal of an NDA graph) during a dynamic verification run. However, when prior art approaches are applied to more complex systems, such as multi-threaded, microprocessor-based computer chips, in which non-deterministically specified behaviors are not highly localized, the user is forced to shift much of the expression of the verification protocol's extent from the automaton's grammar into the test program fragments. As a result, the test program fragments become substantially larger and more complex, and much of the advantage of using an automaton-based approach instead of a programmatic approach is lost.
We can see, therefore, that prior art automaton approaches have several advantages over programmatic approaches, but also rather severe limitations on the degree to which the extent of the verification protocol can be represented in the automaton's grammar. These limitations have prevented the automaton approach to verification from being used in the compliance verification of many complex systems.
A problem with the programmatic approach is that a substantial amount of time and expense is required for test conceptualization, development, update, and maintenance. These costs are often significantly greater than those incurred for the creation of the design of the system itself. Test conceptualization is especially difficult using the programmatic approach because of the test selection problem described previously. The test programs developed using this approach are large pieces of software. All of the well-known difficulties of development, update, and maintenance of large, complex software applications must therefore be surmounted.
A problem with the programmatic approach is that the reduction of observed anomalies to root causes or defects is very costly in time and resources. To decrease the amount of time required to recreate the situation that exhibits an anomaly, the user typically attempts to minimize the set-up sequence. However, determining how to do this appropriately and, especially, correctly editing the test program to achieve it under the programmatic approach is very costly because of the aforementioned difficulties of working with large pieces of complex software.
A problem with the programmatic approach is that test creation early in the product development process is very costly in time and resources. Since a large test program is typically required for verification, it is advantageous to begin its construction as soon as possible. However, early in the product development process it is quite normal for the system specification to change frequently. Correctly identifying and revising the test program code affected by these changes is very costly because of the aforementioned difficulties of working with large pieces of software.
A problem with the programmatic approach is that it requires relatively high levels of software development skill from the developers of tests because of the aforementioned difficulties of working with large pieces of complex software.
A problem with the programmatic approach is that substantial numbers of defects are typically found in the test program while it is being used to verify the target system. It has been observed in the prior art that the number of defects found in the test software is typically at least as large as the number found in the target system itself. Investigating and fixing these defects consume large amounts of time and resources. This is a natural consequence of the fact that the programmatic method requires the creation of substantial amounts of complex software to implement the tests. As is well-known to those skilled in the art, the incidence of defects in complex software is strongly correlated to the number of lines of code written.
A first level of measure of the value of a new test sequence when added to a set of existing test sequences is whether or not it is the same as any of the sequences already in the set, that is, whether it exercises anything different. A test sequence that is not different in at least some aspect is typically not worth adding to the set. A next level of measure of value is how different the new test sequence is, that is, how many different aspects of the system's specification it covers. From this concept, we can then consider the breadth of coverage of a given set of test sequences as its span. Given that it is typically not possible to exercise every possible aspect of a system's behavior, it is valuable to spread the verification effort across as much of the system specification space as possible by creating a set of unique test sequences with as large a span as possible. However, because of the test selection problem described previously, this is very difficult when using the programmatic approach.
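The two levels of measure and the notion of span can be made concrete with a small sketch. The choice of "aspect" used here, namely each adjacent pair of operations in a sequence, is purely an assumption for illustration; in practice an aspect would be tied to items of the system's specification. The function names are likewise hypothetical.

```python
def aspects(sequence):
    """Aspects exercised by one sequence. Here, as a stand-in for
    specification items, the set of adjacent operation pairs."""
    return set(zip(sequence, sequence[1:]))

def span(test_set):
    """Breadth of coverage of a set of sequences: all aspects exercised."""
    covered = set()
    for seq in test_set:
        covered |= aspects(seq)
    return covered

def assess(candidate, test_set):
    """Return (is_new, added): whether the candidate differs from every
    sequence already in the set, and how many new aspects it would add."""
    is_new = tuple(candidate) not in {tuple(s) for s in test_set}
    added = aspects(candidate) - span(test_set)
    return is_new, len(added)

existing = [
    ["reset", "write", "done"],
    ["reset", "read", "done"],
]
candidate = ["reset", "write", "wait", "read", "done"]
is_new, added = assess(candidate, existing)
```

Under these assumptions, the candidate is a new sequence that adds two aspects (the pairs `write`/`wait` and `wait`/`read`) to the existing span of four, whereas resubmitting an existing sequence would add nothing.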
As with any other large software project, the creators of programmatic-approach tests have certain goals when writing test programs: getting the programs done, avoiding defects, making the execution fast, and so forth. These goals are not the same as those that apply when the users have to communicate with the various verification process stakeholders (customers, designers, colleagues, management, etc.). At those times, it is more important for the users to communicate an understanding of the complex body of work represented by the tests. In these cases, hierarchy, abstraction, progressive disclosure of detail, and so forth, are required. These two sets of goals are typically not compatible for large, complex software projects. In the programmatic method, the same difficulties are encountered.
A serious problem with prior art automaton-based methods is their inability to dynamically change the traversal of the automaton or the automaton itself to handle various non-deterministically specified behaviors as they are exhibited by the DUT during verification. For many complex systems, this problem severely limits the degree to which the extent of the verification protocol can be represented in the automaton's grammar. This has prevented prior art automaton approaches to verification from being used in the compliance verification of many complex systems.