The invention relates to the automated design of electronic systems, and in particular, to the automated design of Explicitly Parallel Instruction Computing (EPIC) architectures.
As the workstation and personal computer markets are rapidly converging on a small number of similar architectures, the embedded systems market is enjoying an explosion of architectural diversity. This diversity is driven by widely-varying demands on processor performance and power consumption, and is propelled by the possibility of optimizing architectures for particular application domains. Designers of these application specific instruction-set processors (ASIPs) must make tradeoffs between cost, performance, and power consumption. In many instances, the demands for a particular application can be well served by using a processor having an Explicitly Parallel Instruction Computing (EPIC) architecture. One form of EPIC processor is a very long instruction word (VLIW) processor.
VLIW processors exploit instruction-level parallelism (ILP) by issuing several operations per instruction to multiple functional units. A VLIW processor design specifies the processor's datapath and control path. The datapath includes the functional units for executing operations, registers for storing the inputs and outputs of the operations, and the interconnect for transferring data between the functional units and registers. The control path provides control signals to the control ports in the datapath based on a program, which is either read from memory or hardwired into the control logic.
In addition to supporting explicit instruction level parallelism, EPIC processors may also support additional features to improve processor performance and efficiency. These features include hardware support for speculation, predication, and data speculation. Other features include rotating registers and special branch instructions for executing software pipelines with enhanced efficiency. Throughout this document, references to a VLIW processor are intended to broadly encompass EPIC processors.
VLIW processors can be grouped into two categories: "programmable" and "non-programmable". Programmable VLIW processors are processors that can be programmed by users. The instruction set of these processors is visible to the programmer/compiler so that a programmer can write programs either directly in the machine code or in a high level language that is then compiled to the machine code. These processors are connected to a "program memory" that is used to store the program to be executed. Typically, the program memory is part of the memory system that stores both data and programs, and it is implemented using RAM (random access memory) that can be both read and written.
Non-programmable VLIW processors are designed to execute a specific application or a fixed set of applications. The primary difference between programmable and non-programmable processors lies in the way that the control logic is implemented. In programmable processors, the control logic includes hardware components for fetching user specified instructions from memory, issuing these instructions for execution, and decoding the instructions. In non-programmable processors, the control logic does not accommodate user modified programs. Instead, the control logic is specifically adapted for a particular program. In a microprogram approach, the program is represented as a series of wide words stored in memory. The control logic reads the program words, decodes them, and issues them to the control ports of the datapath. This type of processor is non-programmable in implementations that do not allow the user to modify the program. In a hard-wired approach, the program is hard-wired in control logic, such as a finite state machine, that issues control signals to the processor's datapath.
In designing a VLIW processor, a number of cost/performance trade-offs need to be made. Each of these trade-offs can have a substantial impact on the overall system cost and performance. Unfortunately, designing a VLIW processor today is a fairly cumbersome manual process which must carefully weigh cost and performance tradeoffs in the light of resource sharing and timing constraints of the given micro-architecture. Optimizations and customizations of the processor, if any, with respect to a set of applications or an application domain must also be determined and applied manually.
One research effort has focused on the automated design of ASIPs based on a special type of processor architecture called the Transport Triggered Architecture (TTA). See MOVE citation. Automated design of a processor is particularly important for ASIPs because it makes it possible to evaluate a number of different processor configurations in a process called "design space exploration." Design space exploration refers to a programmatic search procedure used to investigate some or all possible processor designs in a parameterized space in an automated fashion. The design space of even a simple processor model is large, and exhaustive search strategies are of little practical use. Practical schemes can explore only a small subset of the total parameterized space of processors.
The published work on TTA processors cited above outlines a method for automated design space exploration of candidate processors based on their cost (e.g., chip area, number of pins, power dissipation, and code size) and performance (i.e., the inverse of execution time). This approach is limited because it does not incorporate statistics about the internal resource usage of system components into the design exploration process.
The invention provides a programmatic system and method for exploring the design space of a VLIW computer. The term "programmatic" refers to a system or method implemented in a program module or set of program modules. The system and method allow system designers to evaluate many candidate processor designs in an automated fashion.
One aspect of the invention is a programmatic method for designing a VLIW processor using feedback about internal resource utilization. This method reads a specification of a candidate VLIW processor, which describes a specific instance of a parameterized processor design. It then obtains internal resource usage statistics for the candidate processor. For example, in one implementation, a VLIW synthesis process programmatically generates a hardware description of the processor. A compiler, re-targeted to the candidate processor, generates operation issue statistics for an application program to be executed in the candidate processor. The operation issue statistics provide information about how the candidate processor issues operations during execution of the program, such as the quantity, frequency, and timing of the issuance of an operation or set of operations. For example, the statistics may specify how often selected operations are issued concurrently. By mapping these statistics to internal resources such as hardware macrocells, register ports or instruction fields, the design method determines how the processor's operations or hardware components are used during execution of the program. Each operation in a processor's input specification maps to a functional unit that executes it, and the register ports and instruction fields it utilizes when executed in the processor.
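As an illustration of mapping operation issue statistics to internal resources, the following minimal sketch accumulates per-resource usage from per-operation issue counts. The operation names, resource names, and counts are hypothetical and are not taken from any particular implementation.

```python
from collections import Counter

# Hypothetical mapping from each operation to the internal resources
# (functional unit and register ports) it occupies when issued.
OP_RESOURCES = {
    "add": ["alu0", "rf_read_port0", "rf_read_port1", "rf_write_port0"],
    "mul": ["mul0", "rf_read_port0", "rf_read_port1", "rf_write_port0"],
    "load": ["mem0", "rf_read_port2", "rf_write_port1"],
}


def resource_usage(issue_counts, op_resources):
    """Charge every issue of an operation to each resource it maps to."""
    usage = Counter()
    for op, count in issue_counts.items():
        for resource in op_resources[op]:
            usage[resource] += count
    return usage


# Hypothetical issue counts, as a re-targeted compiler might report them.
issue_counts = {"add": 120, "mul": 30, "load": 50}
usage = resource_usage(issue_counts, OP_RESOURCES)
```

In this sketch a shared register port such as `rf_read_port0` accumulates the issue counts of every operation that uses it, which is the kind of information that could indicate an over- or under-provisioned resource.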
Based on these internal resource usage statistics, the method determines a new candidate processor or set of processors and provides an input specification for each new processor. The method then programmatically generates a description of the new candidate processor in a hardware description language from the new specification. It is not necessary to synthesize a complete detailed structural description of each new candidate processor to evaluate it during the design space exploration process. To expedite the design space exploration, it is possible to evaluate a candidate processor based on only a partial synthesis of its structural design or based on an abstract, non-structural instruction set architecture specification. Depending on the criteria used to evaluate a candidate, it is possible to evaluate a candidate processor based on the description of the new candidate processor, or based on a high-level structural processor design synthesized from the description. The process of specifying and evaluating candidate processors may be repeated to explore the parameterized design space in search of candidate processors that satisfy the design objectives, such as execution speed, chip area, circuit complexity, power consumption, etc.
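The repeated specify-and-evaluate process can be sketched as a simple feedback loop. The following is only an illustrative sketch: the `explore` function, the single `alus` parameter, and the toy utilization model are assumptions introduced here, not the method's actual parameters or heuristics.

```python
def explore(initial, evaluate, propose, max_steps):
    """Evaluate a candidate, then use the result to propose the next one."""
    history = []
    candidate = initial
    for _ in range(max_steps):
        result = evaluate(candidate)
        history.append((candidate, result))
        candidate = propose(candidate, result)
        if candidate is None:  # no further refinement suggested
            break
    return history


def toy_evaluate(candidate):
    # Toy model (assumption): the workload can keep at most 3 ALUs busy.
    return {"alu_utilization": min(1.0, 3 / candidate["alus"])}


def toy_propose(candidate, result):
    # Add an ALU while the existing ones are saturated; otherwise stop.
    if result["alu_utilization"] >= 1.0 and candidate["alus"] < 8:
        return {"alus": candidate["alus"] + 1}
    return None


history = explore({"alus": 1}, toy_evaluate, toy_propose, max_steps=10)
```

Here the internal usage statistic (ALU utilization) drives the choice of the next candidate, and the loop stops once an added ALU is no longer fully used.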
Another aspect of the invention is a programmatic method for designing a VLIW processor using abstract, non-structural parameters to specify a candidate processor or set of potential candidates. Like the method summarized above, this method selects a new candidate or candidates based on information derived from a previous candidate processor, but this information may be an external metric such as cost or performance or an internal metric such as internal resource usage. The new candidate processor is specified in terms of non-structural parameters, namely, processor operations or instruction level parallelism constraints among the processor's operations.
Another aspect of the invention is a programmatic method for designing a VLIW processor based on an evaluation of a prior candidate processor or set of processors, optionally including an evaluation based on the synthesized instruction format for a prior candidate. In addition to providing a hardware description of the VLIW processor, this method also designs its instruction format. In some cases, the instruction format may be used to create a hardware description of the processor's control logic. In addition, the instruction format may be used to evaluate the static and dynamic code size of an application program to be executed on the candidate processor.
One implementation of the invention is an automated design system comprising a set of program modules. The system includes components for designing a VLIW processor and evaluating its cost and performance. The design components include a datapath synthesizer, instruction format designer, and control path synthesizer. The datapath synthesizer reads an abstract instruction set architecture specification, including an opcode repertoire, and instruction level parallelism constraints on operations in the opcode repertoire, and programmatically generates a datapath specification from a macrocell library. The datapath includes instances of functional units, register files and an interconnect between data ports of the functional units and register files.
The instruction format designer programmatically generates an instruction format from the datapath specification and the abstract instruction set architecture specification. This instruction format includes instruction templates representing VLIW instructions executable in the VLIW processor, instruction fields of each of the templates, and bit positions and encodings for the instruction fields. The control path synthesizer programmatically generates a control path specification from the instruction format and datapath specification.
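To make the notion of instruction fields with bit positions and encodings concrete, the following sketch lays out the fields of one hypothetical instruction template and encodes a value into it. Packing fields consecutively from bit 0 is an assumption made for simplicity; the field names and widths are likewise hypothetical, and an actual instruction format designer may allocate bit positions quite differently.

```python
def layout_template(fields):
    """Assign consecutive bit ranges (low, high) to (name, width) fields."""
    layout = {}
    position = 0
    for name, width in fields:
        layout[name] = (position, position + width - 1)
        position += width
    return layout, position  # position is the total template width in bits


def encode(layout, values):
    """Pack field values into a single instruction word."""
    word = 0
    for name, value in values.items():
        low, high = layout[name]
        width = high - low + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in field {name}")
        word |= value << low
    return word


# Hypothetical single-operation template.
fields = [("template_id", 2), ("opcode", 4), ("dest", 5), ("src1", 5), ("src2", 5)]
layout, total_bits = layout_template(fields)
word = encode(layout, {"template_id": 1, "opcode": 3, "dest": 2, "src1": 0, "src2": 0})
```

The total template width computed this way is the kind of quantity that feeds a code size estimate for a candidate processor.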
The system also includes a program module called the MDES extractor that extracts a machine description suitable to re-target a compiler. The machine description, referred to as "MDES," provides resource conflict constraints derived from a traversal of a structural description of the processor's datapath. It also provides a specification of the input/output format of the processor's operations. Parameterized by this MDES, a re-targetable compiler generates operation issue statistics for a program executing on a candidate processor.
The components for evaluating the processor include a cost evaluator for evaluating cost of a synthesized VLIW processor, and a performance evaluator for evaluating performance of an application program executed on the synthesized VLIW processor. The cost evaluator determines a processor's cost in terms of the chip area that it occupies, while the performance evaluator determines its performance in terms of how fast it executes a specified program. Other criteria for evaluating a processor's cost/performance may be used as well. For example, the system may evaluate a processor based on the power it consumes by summing the power consumed by each of the hardware macrocells in its design. Also, since internal usage information is available, power consumption can be estimated based on how frequently each macrocell is used for a particular application program.
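The two power estimates mentioned above can be sketched as follows. The macrocell names, per-macrocell power figures, usage counts, and the linear scaling of power with activity are all assumptions chosen for illustration.

```python
def summed_power(macrocells):
    """Estimate power as the plain sum over all macrocells in the design."""
    return sum(cell["power_mw"] for cell in macrocells.values())


def usage_weighted_power(macrocells, usage_counts, total_cycles):
    """Scale each macrocell's power by the fraction of cycles it is used.

    The linear scaling is a simplifying assumption of this sketch.
    """
    total = 0.0
    for name, cell in macrocells.items():
        activity = usage_counts.get(name, 0) / total_cycles
        total += cell["power_mw"] * activity
    return total


# Hypothetical macrocell library entries and usage statistics.
macrocells = {
    "alu0": {"power_mw": 4.0},
    "mul0": {"power_mw": 10.0},
    "mem0": {"power_mw": 6.0},
}
usage_counts = {"alu0": 120, "mul0": 30, "mem0": 50}

peak = summed_power(macrocells)
weighted = usage_weighted_power(macrocells, usage_counts, total_cycles=200)
```

The usage-weighted figure is lower than the plain sum whenever macrocells sit idle for part of the program, which is why internal usage information can refine the estimate for a particular application.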
Finally, the system includes a spacewalker for selecting a candidate VLIW processor for synthesis by the datapath synthesizer, control path synthesizer and instruction format designer, based on the cost and performance of the synthesized VLIW processor. The spacewalker may operate in conjunction with procedures for extracting internal resource utilization information from candidate processors. These procedures translate resource usage information into processor parameters used to specify a new candidate processor.
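One plausible way for a spacewalker to use cost and performance together is to retain only candidates that are not dominated by any other candidate. The following sketch uses such a Pareto filter; this selection rule, the candidate names, and the numbers are assumptions for illustration and are not stated to be the spacewalker's actual policy.

```python
def pareto_front(candidates):
    """Return names of candidates not dominated in (cost, time).

    Lower cost and lower execution time are both better. A candidate is
    dominated if another is no worse in both and strictly better in one.
    """
    front = []
    for name, cost, time in candidates:
        dominated = any(
            (c <= cost and t <= time) and (c < cost or t < time)
            for _, c, t in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (name, chip area cost, execution time) evaluations.
candidates = [("A", 10, 100), ("B", 15, 70), ("C", 16, 90), ("D", 25, 50)]
front = pareto_front(candidates)
```

In this example candidate C is dropped because B is both cheaper and faster, while A, B, and D each represent a distinct cost/performance trade-off.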
Further features of the invention will become apparent from the following detailed description and accompanying drawings.