The invention relates to programmable and non-programmable processors having Explicitly Parallel Instruction Computing (EPIC) architectures.
As the workstation and personal computer markets are rapidly converging on a small number of similar architectures, the embedded systems market is enjoying an explosion of architectural diversity. This diversity is driven by widely-varying demands on processor performance and power consumption, and is propelled by the possibility of optimizing architectures for particular application domains. Designers of these application specific instruction-set processors (ASIPs) must make tradeoffs between cost, performance, and power consumption. In many instances, the demands for a particular application can be well served by using an ASIP having an Explicitly Parallel Instruction Computing (EPIC) architecture. One form of EPIC processor is a very long instruction word (VLIW) processor. Throughout this document, references to a VLIW processor are intended to broadly encompass EPIC processors.
VLIW processors exploit instruction-level parallelism (ILP) by issuing several operations per instruction to multiple functional units. A VLIW processor design specifies the processor's datapath and control path. The datapath includes the functional units for executing operations, registers for storing the inputs and outputs of the operations, and the interconnect for transferring data between the functional units. The control path provides control signals to the control ports in the datapath based on a program, which is either read from memory or hardwired into the control logic.
VLIW processors can be grouped into two categories: "programmable" and "non-programmable" depending on how their control logic is implemented. Programmable VLIW processors are processors that can be programmed by users. The instruction set of these processors is visible to the programmer/compiler so that a programmer can write programs either directly in the machine code or in a high level language that is then compiled to the machine code. These processors are connected to a "program memory" that is used to store the program to be executed. Typically, the program memory is part of the memory system that stores both data and programs, and it is implemented using RAM (random access memory) that can be both read and written.
Non-programmable VLIW processors are designed to execute a specific application or a fixed set of applications.
In the context of our design flow, the difference between the design of programmable and non-programmable processors lies in the way control logic is designed. There are the following two broad approaches for designing the control logic.
1. Finite state machine (FSM) based control: In this approach, there is no program stored in memory; the processor contains all the control logic in the form of a finite state machine. The FSM can be implemented using hard-wired logic in which case the processor is non-programmable and can execute only one program. It can also be implemented using "reconfigurable" hardware such as FPGAs or certain types of PLAs. In this case, the processor can be re-configured to execute a different program.
2. Program counter based control: In this approach, the control is expressed in the form of a program consisting of a sequence of instructions stored in a program memory. The processor contains a program counter (PC) that contains the memory address of the next instruction to execute. In addition, the processor contains control logic that repeatedly performs the following sequence of actions:
A. Fetch instruction from the address in the PC.
B. Decode the instruction and distribute the control to the control points in the processor datapath.
C. Update the PC as follows. If the instruction just executed contains either an implicit or explicit branch and the branch was taken, then the new value of the PC is the branch target address specified in the instruction. In all other cases, the next value of the PC is the address of the next instruction in the program.
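The fetch, decode, and PC-update sequence in steps A through C can be illustrated with a minimal Python sketch. The `Instruction` class, its fields, and the `run` function are hypothetical names introduced only for this illustration; decode is modeled simply as recording the control word, and whether a branch is taken is fixed in advance rather than computed by the datapath.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instruction:
    # Hypothetical instruction model: a control word plus an optional branch.
    control: str
    branch_target: Optional[int] = None  # None means the instruction has no branch
    branch_taken: bool = False           # assumed known here; real hardware resolves it


def run(program: List[Instruction], max_steps: int = 100) -> List[str]:
    """Simulate program counter based control and return the issued control words."""
    pc = 0
    trace = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        instr = program[pc]           # A. fetch the instruction at the PC
        trace.append(instr.control)   # B. decode and distribute control (modeled as a log)
        # C. update the PC: branch target if a branch was taken, else next instruction
        if instr.branch_target is not None and instr.branch_taken:
            pc = instr.branch_target
        else:
            pc += 1
    return trace


program = [
    Instruction("load"),
    Instruction("jump", branch_target=3, branch_taken=True),
    Instruction("skipped"),
    Instruction("store"),
]
trace = run(program)
```

In this example the taken branch at address 1 redirects the PC to address 3, so the instruction at address 2 is never fetched.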
Instruction representation in this approach can take the following two forms:
1. Horizontal instructions: In this approach, the value of each control point is specified directly. A (micro-) program consists of a sequence of wide words, each of which specifies the values for all the control points for an execution cycle. The main advantage of this approach is that the instruction format and the logic in the processor to decode the instruction are very simple. Processors with this type of control are referred to as horizontally micro-programmed processors.
2. Encoded instructions: This approach uses some form of encoding to reduce the size of instructions. It may group control points in easily identifiable groups such as opcode, source and destination register specifiers, etc. Since control is now encoded, the instruction format and the decode logic become more complex. The degree of encoding varies from processor to processor and directly affects the complexity and the cost of the decode logic. General-purpose microprocessors use this type of control.
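The contrast between the two instruction forms can be sketched in Python as follows. The four control points, the 2-bit opcode, and the table contents are assumptions made up for this illustration and do not describe any particular processor.

```python
# Hypothetical datapath with four control points; bit i of a horizontal
# word drives CONTROL_POINTS[i] directly.
CONTROL_POINTS = ("alu_add", "alu_sub", "mem_read", "reg_write")


def decode_horizontal(word: int) -> dict:
    """Horizontal form: no decoding beyond wiring each bit to its control point."""
    return {name: (word >> i) & 1 for i, name in enumerate(CONTROL_POINTS)}


# Encoded form: a 2-bit opcode selects one of the legal control-point
# combinations, so the instruction is narrower but needs a decode table.
OPCODE_TABLE = {
    0b00: 0b0000,  # nop
    0b01: 0b1001,  # add:  alu_add  + reg_write
    0b10: 0b1010,  # sub:  alu_sub  + reg_write
    0b11: 0b1100,  # load: mem_read + reg_write
}


def decode_encoded(opcode: int) -> dict:
    """Encoded form: expand the opcode into control-point values via the table."""
    return decode_horizontal(OPCODE_TABLE[opcode])


add_signals = decode_encoded(0b01)
```

Here the horizontal word needs four bits per cycle while the encoded form needs two, at the cost of the lookup in `OPCODE_TABLE`, which stands in for the decode logic.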
The program to be executed can be stored in either RAM or ROM (i.e., read only memory). In the first case, the processor can be made to execute different programs at different times by loading the appropriate program in the program memory. Thus, such systems can be called programmable systems. In the latter case, there is only one program stored in the memory. Thus, the processor can execute only a single program and is non-programmable.
The following table summarizes the various ways in which control can be implemented.
In designing a VLIW processor, a number of cost/performance trade-offs need to be made. Each of these trade-offs can have a substantial impact on the overall system cost and performance. Unfortunately, designing a VLIW processor today is a fairly cumbersome manual process in which the designer must carefully weigh cost and performance trade-offs in light of the resource sharing and timing constraints of the given micro-architecture. Optimizations and customizations of the processor, if any, with respect to a set of applications or an application domain must also be determined and applied manually.
Manual design of ASIPs is time consuming and therefore poorly suited to their widespread use, because time to market is very important. To reduce design time, designers of embedded systems need a system that can automatically generate ASIPs from functional or algorithmic descriptions of applications.
The invention provides a method and system for automatic and "programmatic" design of VLIW processors. The features of the invention may be used to design general purpose and application-specific processors as well as sub-components of these processor designs. These sub-components include the processor's datapath, control path, and in the case of a programmable processor, its instruction format.
The method may be used to design programmable and non-programmable VLIW processors. For both programmable and non-programmable processors, the method builds the processor's datapath from an abstract Instruction Set Architecture (ISA) specification and a macrocell library of hardware components. The abstract ISA specifies the processor's desired operations (e.g., an opcode repertoire), instruction level parallelism (ILP) constraints among the operations, the I/O format of the operations and the number and type of register files in the processor. The method builds a description of the datapath by selecting instances of functional units that satisfy the ILP constraints. It then allocates register file ports to data ports of the functional units and synthesizes the interconnect between these ports using components from the macrocell library. The resulting datapath specification includes functional unit instances, register file instances, and the interconnect between the functional units and register files.
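As a rough illustration of selecting functional unit instances that satisfy ILP constraints, the following Python sketch counts how many units of each type are needed. It rests on several simplifying assumptions that are not taken from the method itself: the ILP constraints are given as groups of operations that must be issuable in the same cycle, each operation is implemented by exactly one unit type, and the `UNIT_FOR_OP` mapping is a hypothetical stand-in for the macrocell library.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical macrocell library: operation -> functional unit type implementing it.
UNIT_FOR_OP = {
    "add": "alu",
    "sub": "alu",
    "mul": "multiplier",
    "load": "mem_unit",
    "store": "mem_unit",
}


def select_units(concurrency_groups: List[List[str]]) -> Dict[str, int]:
    """Return the number of instances of each unit type needed so that every
    group of concurrently issued operations has a distinct unit per operation."""
    needed: Counter = Counter()
    for ops in concurrency_groups:
        demand = Counter(UNIT_FOR_OP[op] for op in ops)
        for unit, count in demand.items():
            # A unit type must cover its worst-case demand across all groups.
            needed[unit] = max(needed[unit], count)
    return dict(needed)


units = select_units([
    ["add", "add", "load"],    # two ALU operations and one memory operation together
    ["mul", "sub", "store"],
])
```

In this example two ALUs are required because the first group issues two additions at once, while one multiplier and one memory unit suffice.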
For programmable processors, the method programmatically generates the processor's instruction format based on information from the datapath design and the abstract ISA specification. The method selects instruction templates representing the processor's VLIW instructions based on the ILP constraints. It then sets up a bit allocation problem specification that sets forth the instruction field bit width requirements and the mapping of each instruction field to the control ports in the datapath. Finally, the method allocates bit positions in the processor's instruction register to each of the instruction fields. The instruction format specifies the processor's VLIW instructions, the instruction fields for each of the instructions, and bit positions and encodings for the instruction fields.
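The idea of deriving field widths and assigning bit positions can be illustrated with a deliberately simple Python sketch. It handles a single instruction template and packs fields contiguously in the order given, which is an assumption for illustration only; it does not model multiple templates or the mapping of fields to control ports. The field names and choice counts are hypothetical.

```python
from typing import Dict, List, Tuple


def field_width(num_choices: int) -> int:
    """Minimum number of bits needed to encode num_choices distinct values."""
    return (num_choices - 1).bit_length()


def allocate_bits(fields: List[Tuple[str, int]]) -> Tuple[Dict[str, Tuple[int, int]], int]:
    """fields: (name, number of choices) pairs for one instruction template.
    Returns ({name: (lsb, msb)}, total instruction width in bits)."""
    layout = {}
    position = 0
    for name, num_choices in fields:
        width = field_width(num_choices)
        layout[name] = (position, position + width - 1)
        position += width
    return layout, position


# Hypothetical template: 16 opcodes and three specifiers into a 32-entry register file.
layout, total_width = allocate_bits([
    ("opcode", 16),
    ("src1", 32),
    ("src2", 32),
    ("dest", 32),
])
```

For this template the opcode takes 4 bits and each register specifier takes 5 bits, giving a 19-bit instruction.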
The design of the control path depends on whether the processor is programmable or non-programmable. For a programmable processor, the method programmatically generates a hardware description of the control path using information from the processor's instruction format and its datapath. In particular, it programmatically generates a description of the macrocell instances in the instruction unit's datapath from the instruction cache to an instruction register. It also computes logic tables that specify the control logic coupling an instruction sequencer to the instruction unit's datapath and the decode logic coupling the instruction register to the control ports in the datapath. For non-programmable processors using a micro-program control approach, the design flow of the control path is similar. For non-programmable processors using hard-wired control, the design flow does not create an instruction format, and instead, synthesizes the control logic (e.g., a finite state machine) from a scheduled application program.
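For the hard-wired case, the relationship between a scheduled program and a finite state machine can be sketched in Python as follows. This is only a toy model under stated assumptions: the schedule is a straight-line list of per-cycle control signal sets, each cycle becomes one state, and the function names and signal names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def synthesize_fsm(
    schedule: List[Set[str]], loop: bool = False
) -> Tuple[Dict[int, FrozenSet[str]], Dict[int, int]]:
    """Build one state per schedule cycle.
    Returns (outputs per state, next-state table)."""
    if not schedule:
        raise ValueError("schedule must contain at least one cycle")
    outputs = {state: frozenset(signals) for state, signals in enumerate(schedule)}
    last = len(schedule) - 1
    next_state = {state: state + 1 for state in range(last)}
    # After the final cycle, either restart the schedule or hold the last state.
    next_state[last] = 0 if loop else last
    return outputs, next_state


def run_fsm(outputs, next_state, cycles: int) -> List[FrozenSet[str]]:
    """Step the FSM from state 0 and collect the control signals asserted each cycle."""
    state, trace = 0, []
    for _ in range(cycles):
        trace.append(outputs[state])
        state = next_state[state]
    return trace


outputs, next_state = synthesize_fsm(
    [{"load_a"}, {"add", "load_b"}, {"store"}], loop=True
)
trace = run_fsm(outputs, next_state, 4)
```

With `loop=True` the fourth cycle wraps around to the first state, so the control signals of cycle 0 are asserted again.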
This VLIW design method can be used in many different ways. It can be used as an automatic system to design ASIPs from application programs. It can be used as an assistant in the process of the manual design of general-purpose and application-specific processors, or for optimizing or customizing existing architectures to new application domains. It also enables automated design-space exploration of processor architectures by providing a much faster turnaround in designing and evaluating the cost and performance of processor designs for each point in the design-space.
The VLIW design process is implemented as a collection of program modules, which together form a VLIW design system. The functional components of the system include a datapath synthesizer for building the datapath, an instruction format designer for designing the instruction format, a control path synthesizer for constructing the control path, and a machine description (MDES) extractor for extracting a description suitable for re-targeting a compiler.
In some design scenarios, the system uses the extracted MDES to re-target a compiler. The re-targeted compiler schedules an application program (or set of application programs) based on the MDES and provides operation issue statistics. The system uses these statistics to select custom instruction templates that optimize the instruction format design for the application program or set of programs. The MDES may be extracted from the abstract ISA, or from a combination of the abstract ISA and the datapath. The former approach enables the system to optimize an instruction format or concrete ISA specification based on the abstract ISA specification. The latter approach enables the system to design an instruction format using both the constraints in the abstract ISA specification as well as resource sharing constraints in the datapath.
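One simple way to picture template selection from operation issue statistics is a frequency count, sketched below in Python. The selection criterion (pick the `k` most frequently issued operation combinations) and the shape of the statistics are assumptions made for illustration; the source does not state the criterion used.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def select_custom_templates(
    issued_groups: Sequence[Sequence[str]], k: int
) -> List[Tuple[str, ...]]:
    """issued_groups: the operations issued together in each scheduled instruction.
    Returns the k most frequent combinations as candidate custom templates."""
    # Sort each group so that issue order within an instruction does not matter.
    counts = Counter(tuple(sorted(group)) for group in issued_groups)
    return [combo for combo, _ in counts.most_common(k)]


# Hypothetical statistics from a scheduled application program.
schedule = [
    ("add", "load"),
    ("load", "add"),
    ("mul",),
    ("add", "load"),
    ("mul",),
    ("store",),
]
templates = select_custom_templates(schedule, k=2)
```

In this example the add/load pair is issued three times and a lone multiply twice, so those two combinations are chosen; giving them compact templates would shorten the most common instructions.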
The modules in the VLIW system may be used individually or in a variety of combinations for unique VLIW processor design scenarios. In one such scenario, the datapath synthesizer takes an abstract Instruction Set Architecture (ISA) specification and programmatically generates a datapath in a hardware description language using hardware macrocells from the macrocell library. The instruction format designer then programmatically generates the processor's instruction format. The control path synthesizer may be used to construct a hardware description of the components in the processor's instruction unit.
In another scenario, the system extracts the abstract ISA specification from a concrete ISA specification of a VLIW processor, including the instruction format and register files specification of the processor. By extracting the abstract ISA specification, the system may then proceed to build the datapath, extract an MDES, and build a control path based on the abstract ISA specification. The system may also optimize the instruction format using operation issue statistics to select custom templates for an application program or set of applications.
In another scenario, the system extracts the abstract ISA specification from a VLIW datapath description. Using the abstract ISA, the system may then proceed to generate the instruction format, extract an MDES, and construct the control path. As above, the system may also use the MDES to select custom templates.
Further advantages and features will become apparent with reference to the following detailed description and accompanying drawings.