1. Field of Invention
The present invention relates in general to the digital data processing field. More particularly, the present invention relates to adaptive code generation for extensible processor architectures.
2. Background Art
In the latter half of the twentieth century, there began a phenomenon known as the information revolution. While the information revolution is a historical development broader in scope than any one event or machine, no single device has come to represent the information revolution more than the digital electronic computer. The development of computer systems has surely been a revolution. Each year, computer systems grow faster, store more data, and provide more applications to their users.
A modem computer system typically comprises at least one central processing unit (CPU) and supporting hardware necessary to store, retrieve and transfer information, such as communications buses and memory. It also includes hardware necessary to communicate with the outside world, such as input/output controllers or storage controllers, and devices attached thereto such as keyboards, monitors, tape drives, disk drives, communication lines coupled to a network, etc. The CPU or CPUs are the heart of the system. They execute the instructions which comprise a computer program and direct the operation of the other system components.
The overall speed of a computer system is typically improved by increasing parallelism, and specifically, by employing multiple CPUs (also referred to as processors). The modest cost of individual processors packaged on integrated circuit chips has made multi-processor systems practical, although such multiple processors add more layers of complexity to a system.
From the standpoint of the computer's hardware, most systems operate in fundamentally the same manner. Processors are capable of performing very simple operations, such as arithmetic, logical comparisons, and movement of data from one location to another. But each operation is performed very quickly. Sophisticated software at multiple levels directs a computer to perform massive numbers of these simple operations, enabling the computer to perform complex tasks. What is perceived by the user as a new or improved capability of a computer system is made possible by performing essentially the same set of very simple operations, using software having enhanced function, along with faster hardware.
In the very early history of the digital computer, computer programs which instructed the computer to perform some task were written in a form directly executable by the computer's processor. Such programs were very difficult for a human to write, understand and maintain, even when performing relatively simple tasks. As the number and complexity of such programs grew, this method became clearly unworkable. As a result, alternative forms of creating and executing computer software were developed. In particular, a large and varied set of high-level languages was developed for supporting the creation of computer software.
High-level languages vary in their characteristics, but all such languages are intended to make it easier for a human to write a program to perform some task. Typically, high-level languages represent instructions, fixed values, variables, and other constructs in a manner readily understandable to the human programmer rather than the computer. Such programs are not directly executable by the computer's processor. In order to run on the computer, the programs must first be transformed into a form that the processor can execute.
Transforming a high-level language program into executable form requires the human-readable program form (i.e., source code) be converted to a processor-executable form (i.e., object code). This transformation process generally results in some loss of efficiency from the standpoint of computer resource utilization. Computers are viewed as cheap resources in comparison to their human programmers. High-level languages are generally intended to make it easier for humans to write programming code, and not necessarily to improve the efficiency of the object code from the computer's standpoint. The way in which data and processes are conveniently represented in high-level languages does not necessarily correspond to the most efficient use of computer resources, but this drawback is often deemed acceptable in order to improve the performance of human programmers.
While certain inefficiencies involved in the use of high-level languages may be unavoidable, it is nevertheless desirable to develop techniques for reducing inefficiencies where practical. This has led to the use of compilers and so-called “optimizing” compilers. A compiler transforms source code to object code by looking at a stream of instructions, and attempting to use the available resources of the executing computer in the most efficient manner. For example, the compiler allocates the use of a limited number of registers in the processor based on the analysis of the instruction stream as a whole, and thus hopefully minimizes the number of load and store operations. An optimizing compiler might make even more sophisticated decisions about how a program should be encoded in object code. For example, the optimizing compiler might determine whether to encode a called procedure in the source code as a set of in-line instructions in the object code.
Processor architectures (e.g., Power, x86, etc.) are commonly viewed as static and unchanging. This perception is inaccurate, however, because processor architectures are properly characterized as extensible. Although the majority of processor functions typically do remain stable throughout the architecture's lifetime, new features are added to processor architectures over time. A well known example of this extensibility of processor architecture was the addition of a floating-point unit to the x86 processor architecture, first as an optional co-processor, and eventually as an integrated part of every x86 processor chip. Thus, even within the same processor architecture, the features possessed by one processor may differ from the features possessed by another processor.
When a new feature is added to a processor architecture, software developers are faced with a difficult choice. A computer program must be built either with or without instructions supported by the new feature. A computer program with instructions requiring the new feature is either incompatible with older hardware models that do not support these instructions and cannot be used with them, or older hardware models must use emulation to support these instructions. Emulation works by creating a trap handler that captures illegal instruction exceptions, locates the offending instruction, and emulates its behavior in software. This may require hundreds of instructions to emulate a single unsupported instruction. The resulting overhead may cause unacceptable performance delays when unsupported instructions are executed frequently.
If emulation is not acceptable for a computer program, developers may choose either to limit the computer program to processors that support the new feature, or to build two versions of the computer program, i.e., one version that uses the new feature and another version that does not use the new feature. Both of these options are disadvantageous. Limiting the computer program to processors that support the new features reduces the market reach of the computer program. Building two versions of the computer program increases the cost of development and support.
In certain object-oriented virtual machine (VM) environments, such as the Java and .NET virtual machines, this compatibility problem is solved by using just-in-time (JIT) compilation. A JIT compiler recompiles code from a common intermediate representation each time a computer program is loaded into the environment. Each computer may have a different JIT compiler that takes advantage of the features present on that computer. This is very helpful, but has a number of drawbacks. One drawback is that recompilation occurs frequently, i.e., each time the computer program is loaded. Another drawback is that JIT compilation is not a solution in non-VM environments. The vast majority of computer programs in use today are statically compiled code, and this is expected to remain the case for many years.
Because of the problems involved with exploiting new features, software developers typically will not do so until the features become common on all supported computers on their platform. This often leads to an extraordinarily lengthy time lapse between introduction of the hardware features and their general acceptance. For example, five or more years may pass between implementation of a new hardware feature and its exploitation.
A need exists for a more flexible system that allows computer programs to automatically take advantage of new hardware features when they are present, and avoid using them when they are absent.