Most high-level languages contain a construct that is typically viewed as a multi-way branch, in which a runtime variable is compared with each member of a set of constants and a statement is selected based on the result of the comparisons. Such constructs are commonly known as case or switch statements. The runtime variable, which can be an expression, is known as the "case selector", and the constants against which it is tested are known as "case items". As is well known in the art, multi-way branches use various selection methods to perform the required logic, including range tests, jump tables, compare-and-branch sequences, address translation and arithmetic progression. The method that yields the best execution performance varies greatly with the underlying architecture of the machine on which the compiled program is executed, and also with the distribution of the values in the selection set, that is, the set of constants against which the runtime variable is to be compared.
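By way of illustration only, two of the selection methods named above, a jump table guarded by a range test and a compare-and-branch sequence arranged as a binary search, can be sketched as follows. The function names, the use of a Python dictionary to hold the case items, and the string "targets" standing in for branch addresses are assumptions made for this sketch and do not come from any cited reference.

```python
from bisect import bisect_left


def build_jump_table(case_items, default):
    """Build a dense table indexed by (selector - lowest case item).

    case_items maps each constant to its branch target (hypothetical layout).
    Gaps between constants are filled with the default target.
    """
    low, high = min(case_items), max(case_items)
    table = [default] * (high - low + 1)
    for value, target in case_items.items():
        table[value - low] = target
    return low, table


def dispatch_jump_table(selector, low, table, default):
    """Range test followed by a single indexed lookup."""
    index = selector - low
    if 0 <= index < len(table):
        return table[index]
    return default


def dispatch_binary_search(selector, sorted_values, targets, default):
    """Compare-and-branch over the sorted case items, O(log n) comparisons."""
    pos = bisect_left(sorted_values, selector)
    if pos < len(sorted_values) and sorted_values[pos] == selector:
        return targets[pos]
    return default
```

For a dense selection set such as {1, 2, 3, 5} the table needs only five entries and dispatch is one bounds check plus one lookup. For a sparse set such as {1, 1000, 100000} the same table would need 100,000 entries, nearly all of them holding the default, so a compare-and-branch sequence is likely to be the better choice. This is the dependence on the distribution of the selection set described above.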
Known attempts to produce the best-performing code have generally been tied to the particular machine architecture for which the program is being compiled, that is, the target machine. This has been done by building into the solving method, and into the heuristic information, assumptions about the code formats that are optimal for that machine. For example, since multi-way branching is a commonly used construct in high-level languages, architects of some Complex Instruction Set Computer (CISC) systems have built instructions into such computers which provide table-indexed branching at relatively little cost. Implementers of multi-way branching logic for such architectures frequently tailor their solution method so as to always make use of such an instruction, even though for some collections of data it is not the best performer. The known prior art methods thus produce code that is optimal for only one machine, and under only one set of conditions of the selection set that is to be compared with the runtime variable.
R. Bernstein, "Producing Good Code for the Case Statement," IBM RESEARCH REPORT 10525, (#45755) Dec. 6, 1983, describes some considerations to be used in determining which test for a case statement is appropriate, where the available tests include jump table, range test, binary search and linear search. The report discloses a method of selection that starts with all the case items as a single cluster, sometimes called a "set", and breaks that cluster into smaller and smaller clusters until each meets the minimum case-density requirements for a jump table solution. The several types of tests are then performed in a predesignated sequence, always beginning with a jump table.
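The cluster-splitting idea can be sketched roughly as below. This is not the procedure of the cited report. The density measure (number of case items divided by the span of values they cover), the threshold of 0.5, and the rule of splitting at the widest gap between adjacent case items are all assumptions chosen only to make the sketch concrete.

```python
def density(values):
    """Fraction of the covered range that holds a case item (assumed measure)."""
    return len(values) / (values[-1] - values[0] + 1)


def split_into_clusters(values, min_density=0.5):
    """Recursively split case items until every cluster is dense enough.

    min_density is a hypothetical threshold; a real compiler would tune it
    for the target machine.
    """
    values = sorted(set(values))
    if not values:
        return []
    if len(values) == 1 or density(values) >= min_density:
        return [values]
    # Assumed split rule: cut at the widest gap between neighbouring items.
    gaps = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return (split_into_clusters(values[:cut], min_density)
            + split_into_clusters(values[cut:], min_density))
```

Under these assumptions, the case items 1, 2, 3, 4, 100, 101, 102 and 5000 separate into three clusters: [1, 2, 3, 4], [100, 101, 102] and [5000]. Each cluster could then be served by its own jump table, with some other test choosing among the clusters.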
A very general discussion of optimization methods is given in Chapter 10 of Aho, Sethi and Ullman, COMPILERS: PRINCIPLES, TECHNIQUES AND TOOLS, Addison-Wesley (1988). Other relevant prior art references are L. Atkinson, "Optimizing Two-state Case Statements in Pascal", SOFTWARE--PRACTICE AND EXPERIENCE, 12:6 (1982), 571-581; and J. Hennessy and N. Mendelsohn, "Compilation of the Pascal Case Statement", SOFTWARE--PRACTICE AND EXPERIENCE, 12:9 (1982), 879-882.
Despite these attempts, there remains a need for an optimizing method for evaluating case statements at minimum cost of execution which not only handles all numbers and densities of case items likely to be encountered in compilation, but is also portable, i.e., can be used on target machines having differing architectures.