The ability to efficiently migrate programs, programmers, hardware designs and hardware designers among various tools motivates the need to measure the conformity between a specific tool and the language standard to which the tool claims conformity. Such conformity assessment is often done by applying many test cases to a tool under test and evaluating the tool's response (black-box testing). Use of tools which conform well and in known ways to a language standard reduces training costs, development costs, maintenance costs, time to market and risk. However, in the current art, the apparatus and methods for developing and maintaining the required language conformity tests result in sub-optimal measurement fidelity at undesirably high cost.
A commercially interesting language generally has several hundred lexical and syntactic productions paired with several hundred semantic constraints. For example, a lexical production may characterize an identifier lexical token as having a leading alphabetic character followed by zero or more alphanumeric characters or underscores, such that no two underscores appear consecutively. Perhaps the identifier then appears in a syntactic production for a variable declaration. The example variable declaration syntax includes one or more identifier lexical tokens, a colon lexical token, an identifier denoting a previously defined type, an equals lexical token and an initial value. The initial value is a non-terminal production defined in terms of other non-terminal productions or lexical tokens (the syntactic terminals). Examples of semantic constraints include nested block structure, which controls the set of type definitions that are visible, and the compatibility between the type of the declaration and the initial value, perhaps requiring that both are of integer type.
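The identifier production described above can be sketched as a regular expression. This is only an illustration under stated assumptions: the name `is_identifier` is hypothetical, letters are taken to be ASCII, and a trailing underscore is accepted because the description does not exclude it.

```python
import re

# Assumed reading of the rule in the text: a leading letter, then letters,
# digits, or underscores, with no two underscores in a row.
# ASCII letters only, and a trailing underscore is allowed (assumptions).
IDENTIFIER = re.compile(r"[A-Za-z](?:_?[A-Za-z0-9])*_?")


def is_identifier(token: str) -> bool:
    """Return True if the whole token matches the identifier production."""
    return IDENTIFIER.fullmatch(token) is not None


if __name__ == "__main__":
    # A mix of tokens that the rule should allow and disallow.
    for token in ["count", "bus_width2", "a__b", "_tmp", "9lives"]:
        print(token, is_identifier(token))
```

Even this single lexical rule yields both allowed tokens (`count`, `bus_width2`) and disallowed ones (`a__b`, `_tmp`, `9lives`), each of which a conformance suite would want to exercise.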
In order to achieve high fidelity, language conformance testing must consider the cross product of the lexical productions, syntactic productions and semantic constraints. For example, the sequence of test cases applied to a tool under test must not only consider the variable declaration in isolation, but also in a myriad of contexts including nested blocks and other declarations. With tens of lexical productions, hundreds of syntactic productions (some of which are optional or repeated arbitrarily) and hundreds of semantic constraints, one can readily see that generating millions of carefully chosen test cases would be desirable to achieve high-fidelity language conformance testing.
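The growth of the cross product can be made concrete with a short sketch. The three small lists and the sizes 30, 300 and 300 are illustrative assumptions chosen to match "tens" and "hundreds"; they are not figures from any particular language standard.

```python
from itertools import product

# Hypothetical, tiny stand-ins for the three dimensions named in the text.
lexical = ["x", "x_1", "x__1"]              # identifier spellings
syntactic = ["single", "list", "no_init"]   # declaration shapes
semantic = ["top_level", "nested_block"]    # surrounding contexts

# Every combination of one choice from each dimension.
combinations = list(product(lexical, syntactic, semantic))


def cross_product_size(n_lexical: int, n_syntactic: int, n_semantic: int) -> int:
    """Number of combinations when one item is drawn from each dimension."""
    return n_lexical * n_syntactic * n_semantic


if __name__ == "__main__":
    print(len(combinations))                  # 3 * 3 * 2 combinations
    print(cross_product_size(30, 300, 300))   # assumed realistic sizes
```

With the assumed sizes the product is 2,700,000 combinations, before optional or repeated productions are taken into account.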
Language conformance tests must include both language productions which are specifically allowed and those that are disallowed so that the conformance testing process can detect both correct language tests which are (improperly) disallowed by a tool under test and incorrect language tests which are (improperly) allowed by a tool under test. In order to evaluate conformance, both sets of tests must be generated such that the conformity of the test to the language standard is known independent of applying the test to any tool under test (classification).
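The two failure modes described above can be sketched as a comparison between a test's known classification and the tool's response. The names `Verdict` and `judge` are hypothetical and serve only to illustrate the idea.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "tool agrees with the classification"
    FALSE_REJECT = "correct test improperly disallowed"
    FALSE_ACCEPT = "incorrect test improperly allowed"


def judge(test_is_valid: bool, tool_accepted: bool) -> Verdict:
    """Compare the independent classification with the tool's response."""
    if test_is_valid == tool_accepted:
        return Verdict.PASS
    # The tool disagrees; the direction of the disagreement names the fault.
    return Verdict.FALSE_REJECT if test_is_valid else Verdict.FALSE_ACCEPT


if __name__ == "__main__":
    print(judge(test_is_valid=True, tool_accepted=False).value)
    print(judge(test_is_valid=False, tool_accepted=True).value)
```

The comparison is meaningful only because `test_is_valid` is known before the tool is run, which is the classification requirement stated above.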
Current art in the generation of language conformance tests relies largely on manual test case generation and manual classification. Since manual editing and classification of a test case typically requires an expert between fifteen minutes and an hour per test, it is seldom economically feasible to generate test suites with more than ten thousand test cases. Despite their cost, such manually generated suites fall substantially short of the millions of test cases required for high-fidelity validation. Such current art is an economic compromise between the desire for millions of carefully chosen test cases and the test case development effort which is economically affordable.
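The economics can be checked with simple arithmetic. The sketch below uses only the figures given above (fifteen minutes to one hour per test) and the hypothetical helper name `expert_hours`.

```python
def expert_hours(test_count: int, minutes_per_test: float) -> float:
    """Total expert effort, in hours, to write and classify the tests by hand."""
    return test_count * minutes_per_test / 60


if __name__ == "__main__":
    # Ten thousand tests (the practical ceiling) versus one million (the goal).
    for n in (10_000, 1_000_000):
        low = expert_hours(n, 15)
        high = expert_hours(n, 60)
        print(f"{n:>9,} tests: {low:,.0f} to {high:,.0f} expert hours")
```

At these rates a ten-thousand-test suite costs 2,500 to 10,000 expert hours, while a one-million-test suite would cost 250,000 to 1,000,000 expert hours.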
When a test suite is manually generated, encompassing a small fraction of the desired language validation space, a human is unlikely to touch on a significant number of the nonintuitive test cases which may arise during practical use of a tool under test. Humans are not well suited to impose generation rigor spanning thousands of test cases. As a result, test suite fidelity is compromised during manual test case generation.
A useful language standard undergoes periodic revision. Since such revisions alter the language definition, the revisions must be reflected in the associated language conformance test suites in order to maintain a high-fidelity validation suite. The manual effort required to identify and modify test cases impacted by a language revision is significant. Numerous lexical productions, syntactic productions and semantic constraints go into the definition of a single test case. A complete, manually generated cross index of language definition points and test cases is generally not feasible. Maintenance of manually generated suites is thus an expensive process with sub-optimal fidelity.
Manually generated test cases are initially classified as good or bad by the test case author. Such manual classification is refined by iterative application of the manually generated test cases to a sequence of tools under test. Any discrepancies between the manual classification and the tool response must be manually interpreted and considered for test re-classification. Such a process is expensive, error-prone and relies on the availability of many tools under test for a given language standard in order to approach high fidelity. Such a process never directly identifies test cases which are missing from the manually generated test suite yet are needed to discriminate between correct and incorrect language.
In the current art, a single test case may be automatically permuted by character replacement in order to yield a sequence of closely related test cases. For example, a test case may be permuted to write various types and/or values into a file. Such automatic permutation spans only a small space within the set of desired tests, generally a single set of syntactic productions and semantic constraints common to all permuted tests.
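Permutation by replacement might look like the following sketch. The template text, the `<TYPE>` and `<VALUE>` placeholders and the function name `permute` are all hypothetical and do not reflect the syntax of any particular language or existing tool.

```python
from itertools import product

# Hypothetical test-case template; the placeholders are replaced textually.
TEMPLATE = "variable v : <TYPE> = <VALUE>; write(f, v);"


def permute(template: str, types: list[str], values: list[str]) -> list[str]:
    """Produce one test case per (type, value) pair by textual replacement."""
    return [
        template.replace("<TYPE>", t).replace("<VALUE>", v)
        for t, v in product(types, values)
    ]


if __name__ == "__main__":
    for case in permute(TEMPLATE, ["integer", "real"], ["0", "1", "-1"]):
        print(case)
```

Every generated case shares the same declaration and write statement, so the six variants exercise one syntactic shape with different types and values, which is the narrow coverage described above.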
The current state of the art, whether manual test suite development or automatically permuted test cases, results in sub-optimal conformance testing fidelity of a tool under test, high development cost and high maintenance cost. An apparatus and means achieving higher-fidelity conformance testing with lower development and maintenance effort, as disclosed in the present invention, are novel and useful.