The invention relates to programmable processors in general, and more particularly to the automated design of instruction formats for Explicitly Parallel Instruction Computing (EPIC) architectures. In this document, references to Very Long Instruction Word (VLIW) processors are meant to broadly encompass EPIC architectures.
As the workstation and personal computer markets are rapidly converging on a small number of similar architectures, the embedded systems market is enjoying an explosion of architectural diversity. This diversity is driven by widely varying demands on processor performance and power consumption, and is propelled by the possibility of optimizing architectures for particular application domains. Designers of these application-specific instruction-set processors (ASIPs) must make tradeoffs between cost, performance, and power consumption. In many instances, the demands for a particular application can be well served by using a very long instruction word (VLIW) architecture processor.
VLIW processors exploit instruction-level parallelism (ILP) by issuing several operations per instruction to multiple functional units. The processor's machine language provides the interface between hardware and software, while the instruction format specifies the precise syntax and binary encodings of all instructions in the machine language. A key step in processor design is the design of the instruction format. Compact instruction encodings reduce overall program size and improve instruction cache performance, but may require more costly instruction alignment and decode hardware. Simple encodings permit faster and less expensive alignment and decode hardware, possibly at the expense of increased program size. These tradeoffs between hardware complexity and code size can have a substantial impact on the overall system cost and performance. Unfortunately, designing the instruction encoding for a VLIW processor today is a fairly cumbersome manual process which must carefully weigh the above-mentioned cost and performance tradeoffs in the light of resource sharing and timing constraints of the given micro-architecture. Optimizations and customizations of the instruction encodings, if any, with respect to a set of applications or an application domain must also be determined and applied manually.
The invention provides a computer-implemented method for automatic design of efficient binary instruction encodings. The method automatically finds compact instruction formats that express and exploit the full parallelism specified in the underlying processor microarchitecture, subject to constraints on alignment and decode hardware complexity. Furthermore, the method can be guided by statistics about the composition and frequency of program instructions, so that the instruction format design is customized to a particular set of applications or an application domain.
This instruction format design method can be used in many different ways. It can be used as an assistant in the process of the manual design of general-purpose and application-specific processors, or for optimizing or customizing existing architectures to new application domains. It also enables automated design-space exploration of processor architectures by providing a much faster turnaround in designing and evaluating the cost and performance of instruction encodings for each point in the design-space.
The instruction format design process is implemented in a collection of program modules. These modules may be used individually or in a variety of combinations for unique instruction format design scenarios. In one such scenario, the design process takes an abstract Instruction Set Architecture (ISA) specification, a datapath specification, and optionally, custom instruction templates, and programmatically generates a specification of a bit allocation problem, which specifies the instruction fields for each template along with constraints and bit width requirements that control the allocation of bit positions to each field. In another scenario, the design process takes the bit allocation problem specification, and programmatically allocates bit positions to the instruction fields. These design scenarios may be combined to create a concrete ISA specification from the abstract ISA specification and the datapath programmatically.
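By way of illustration only, the second scenario above can be pictured with the following Python sketch. It represents a bit allocation problem as a set of templates, each listing instruction fields with their bit width requirements, and assigns contiguous bit positions to the fields in order. The names `Field`, `allocate_bits`, `template_width`, and the example templates are hypothetical and do not appear in the specification. The simple left-to-right policy is an assumption that stands in for the actual allocation process, and it ignores constraints such as fields that must occupy the same positions in several templates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One instruction field and its bit width requirement (hypothetical type)."""
    name: str
    width: int


def allocate_bits(templates):
    """Assign contiguous bit positions to every field of every template.

    templates: dict mapping template name -> ordered list of Field.
    Returns a dict mapping template name -> {field name: (lsb, msb)}.
    This is a simple left-to-right placement, not the method of the
    specification; cross-template constraints are not modeled.
    """
    layout = {}
    for template_name, fields in templates.items():
        position = 0
        assigned = {}
        for field in fields:
            assigned[field.name] = (position, position + field.width - 1)
            position += field.width
        layout[template_name] = assigned
    return layout


def template_width(layout, template_name):
    """Total number of bits occupied by one allocated template."""
    return max(msb for _, msb in layout[template_name].values()) + 1


# Illustrative input: field names and widths are made up for this sketch.
EXAMPLE_TEMPLATES = {
    "alu_mem": [
        Field("template_id", 2),
        Field("alu_opcode", 5), Field("alu_src1", 5),
        Field("alu_src2", 5), Field("alu_dest", 5),
        Field("mem_opcode", 3), Field("mem_addr", 5), Field("mem_data", 5),
    ],
    "branch": [
        Field("template_id", 2),
        Field("br_opcode", 3), Field("br_target", 16),
    ],
}

EXAMPLE_LAYOUT = allocate_bits(EXAMPLE_TEMPLATES)
```

In this sketch the `alu_mem` template occupies 35 bits and the `branch` template 21 bits, which shows how the template set determines the widths that the alignment and decode hardware must handle.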
In another scenario, the design process programmatically generates custom instruction templates based on operation issue statistics that indicate how a particular application program uses various processor operations. For example, a custom template selection module selects instruction templates that minimize certain cost functions, such as one that quantifies the code size. As alluded to above, the instruction design process can take a combination of custom templates and an abstract ISA specification and generate a concrete ISA specification. This approach may be used to generate a new concrete ISA specification or optimize an existing one.
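The idea of selecting custom templates from operation issue statistics can be sketched as follows. This Python example is a minimal illustration under stated assumptions, not the selection module of the invention: it uses a greedy search, models code size as the number of bits needed to encode every issued instruction with the narrowest template that covers it, and treats the template header width as fixed regardless of how many templates exist. The names `template_bits`, `code_size`, `select_templates`, and all example numbers are hypothetical.

```python
def template_bits(ops, op_bits, header_bits):
    """Width of a template that can issue the given set of operations."""
    return header_bits + sum(op_bits[op] for op in ops)


def code_size(stats, templates, op_bits, header_bits):
    """Total bits needed to encode the program described by stats.

    stats: dict mapping frozenset of co-issued operations -> issue count.
    Each instruction uses the narrowest template that covers its operations.
    At least one template must cover every combination in stats.
    """
    total = 0
    for combo, count in stats.items():
        widths = [template_bits(t, op_bits, header_bits)
                  for t in templates if combo <= t]
        total += count * min(widths)
    return total


def select_templates(stats, base, op_bits, header_bits, max_custom):
    """Greedily add up to max_custom templates that reduce code size most.

    base is a template assumed to cover every combination in stats.
    """
    chosen = [base]
    candidates = set(stats) - {base}
    for _ in range(max_custom):
        best = None
        best_size = code_size(stats, chosen, op_bits, header_bits)
        # Sort candidates so that ties are broken deterministically.
        for candidate in sorted(candidates, key=sorted):
            size = code_size(stats, chosen + [candidate], op_bits, header_bits)
            if size < best_size:
                best, best_size = candidate, size
        if best is None:
            break  # no remaining candidate reduces code size
        chosen.append(best)
        candidates.discard(best)
    return chosen


# Illustrative statistics: how often each operation combination is issued.
OP_BITS = {"alu": 20, "mem": 18, "br": 19}
HEADER_BITS = 2
BASE = frozenset({"alu", "mem", "br"})
STATS = {
    frozenset({"alu"}): 500,
    frozenset({"alu", "mem"}): 300,
    frozenset({"br"}): 100,
    BASE: 100,
}

CHOSEN = select_templates(STATS, BASE, OP_BITS, HEADER_BITS, max_custom=2)
```

With these made-up statistics, the single 59-bit base template gives a code size of 59,000 bits. Adding an `alu`-only template and then an `alu`+`mem` template reduces it to 34,800 bits, at the cost of decode hardware that must recognize three formats instead of one.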
In yet another scenario, the instruction format design process takes a concrete ISA specification and a list of instruction level parallelism constraints on specified processor operations, and programmatically generates an optimized concrete ISA specification. One unique aspect of this scenario is a program module that extracts an abstract ISA specification from the concrete ISA specification. This enables other program modules outlined above to take the combined ILP constraints and extracted abstract ISA specification and programmatically generate the bit allocation problem specification and allocate bit positions to each of the instruction fields.
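One way to picture the extraction step is shown in the Python sketch below. It assumes, purely for illustration, that a concrete ISA specification lists templates whose operation slots carry fields with assigned bit positions, and that the abstract specification retains only the operations, their field widths, and the sets of operations that can be issued together. The dictionary layout and the name `extract_abstract_isa` are hypothetical. How the ILP constraints are combined with the extracted specification is not modeled here.

```python
def extract_abstract_isa(concrete):
    """Strip encoding details from a concrete ISA description.

    concrete: {"templates": [{"name": str,
                              "slots": [{"operation": str,
                                         "fields": {name: (lsb, msb)}}]}]}
    Returns {"operations": {operation: {field name: width}},
             "concurrency_sets": [frozenset of operations per template]}.
    """
    operations = {}
    concurrency_sets = []
    for template in concrete["templates"]:
        issued_together = set()
        for slot in template["slots"]:
            op = slot["operation"]
            issued_together.add(op)
            # Keep only the width of each field, not its bit position.
            widths = {name: msb - lsb + 1
                      for name, (lsb, msb) in slot["fields"].items()}
            operations.setdefault(op, widths)
        concurrency_sets.append(frozenset(issued_together))
    return {"operations": operations, "concurrency_sets": concurrency_sets}


# Illustrative concrete ISA: names and bit positions are made up.
EXAMPLE_CONCRETE = {
    "templates": [
        {"name": "t0", "slots": [
            {"operation": "add",
             "fields": {"opcode": (0, 4), "src1": (5, 9),
                        "src2": (10, 14), "dest": (15, 19)}},
            {"operation": "load",
             "fields": {"opcode": (20, 22), "addr": (23, 27),
                        "dest": (28, 32)}},
        ]},
        {"name": "t1", "slots": [
            {"operation": "branch",
             "fields": {"opcode": (0, 2), "target": (3, 18)}},
        ]},
    ]
}

EXAMPLE_ABSTRACT = extract_abstract_isa(EXAMPLE_CONCRETE)
```

The result no longer mentions bit positions, so it could in principle be fed back into a bit allocation step to produce a different, possibly more compact, concrete encoding.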
Further advantages and features will become apparent with reference to the following detailed description and accompanying drawings.