1. Field
The following description relates to a processor that executes instructions stored in a program memory, and more particularly to a very long instruction word processor.
2. Description of the Related Art
A very long instruction word (VLIW) machine refers to a central processing unit (CPU) architecture for exploiting instruction level parallelism (ILP). In a superscalar architecture, a processor includes a number of multiprocessing blocks. Multiple instructions of a sequence of instructions to be executed are processed simultaneously by the multiprocessing blocks. In such a parallel architecture, hardware with a complex configuration is required to control the scheduling of instruction execution.
In a VLIW approach, a compiler (i.e., software outside of the processor) schedules instruction execution. As a result, the instruction execution schedule in the processor is fixed. Therefore, the complex hardware for control may be simplified.
An instruction bundle of a VLIW machine includes the instructions to be executed simultaneously by the multiprocessing blocks of the processor. The number of instructions that can be executed in parallel may be smaller than the width of a VLIW instruction because of factors such as limited ILP. In this case, “no operation” (NOP) instructions fill each empty instruction slot. For memory efficiency, the regions containing NOP instructions are compressed when an instruction bundle is stored. The compression is accomplished by storing a stop bit together with the instructions, where the stop bit indicates the presence of a NOP. The stop bit is used to determine the instructions to be executed in the subsequent clock cycle and is also used to calculate the next program counter. However, since the stop bit is read from a memory, its value may be determined only after the memory read latency has elapsed. Within a single clock cycle, most of the time is spent waiting for the memory to output the value of the stop bit. Due to the time consumed by reading the stop bit, the clock cycle may need to be lengthened, or in some cases an additional clock cycle may need to be added to each instruction fetch cycle to avoid such lengthening. Either change acts as a bottleneck that restricts the clock speed of a VLIW machine.
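The compression and the resulting dependency of the next program counter on the stop bits may be illustrated with a minimal sketch. The encoding shown is an assumption made for illustration only: a stop bit of 1 marks the last stored instruction of a bundle, so that any remaining slots are implicit NOPs. The issue width `WIDTH`, the names `compress` and `fetch`, and the string representation of instructions are likewise hypothetical and do not come from the description above.

```python
NOP = "nop"
WIDTH = 4  # assumed issue width (slots per bundle), for illustration only


def compress(bundles):
    """Store bundles without NOP padding.

    Each stored word is an (instruction, stop_bit) pair. Under the assumed
    encoding, stop_bit == 1 marks the last stored instruction of a bundle.
    """
    memory = []
    for bundle in bundles:
        ops = [op for op in bundle if op != NOP]
        if not ops:
            ops = [NOP]  # an all-NOP bundle still occupies one stored word
        for i, op in enumerate(ops):
            stop_bit = 1 if i == len(ops) - 1 else 0
            memory.append((op, stop_bit))
    return memory


def fetch(memory, pc):
    """Rebuild the bundle that starts at address pc.

    Returns (bundle padded with NOPs to WIDTH, next program counter).
    The next program counter cannot be computed until the stop bits
    have been read out of memory.
    """
    ops = []
    while True:
        op, stop_bit = memory[pc + len(ops)]
        ops.append(op)
        if stop_bit:
            break
    next_pc = pc + len(ops)
    return ops + [NOP] * (WIDTH - len(ops)), next_pc
```

In this sketch, three bundles of width four that contain two, one, and four real instructions occupy seven stored words instead of twelve. The value of `next_pc` returned by `fetch` depends on stop bits that are only available after the memory read completes, which corresponds to the latency problem described above.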