1. Field
Apparatuses and methods consistent with exemplary embodiments relate to an instruction set architecture (ISA) of a computer architecture, and more particularly, to an apparatus and method for compressing an instruction for a very long instruction word (VLIW) processor, and an apparatus and method for fetching an instruction.
2. Description of the Related Art
A very long instruction word (VLIW) processor includes a plurality of functional units that execute a plurality of instructions in parallel. Instructions input to the VLIW processor may be grouped into instruction bundles, each including a number of instructions corresponding to the number of functional units, and the instruction bundles are simultaneously executed in the plurality of functional units. A VLIW processor may reduce the time to execute all input instructions by distributing the input instructions among the plurality of functional units.
Theoretically, the maximum number of instructions that can be simultaneously executed by the VLIW processor is the same as the number of functional units. However, due to dependencies among instructions, the number of valid instructions that can be executed in parallel in each execution period may be smaller than the number of functional units. For example, when a previous instruction has not yet produced the result that a subsequent instruction requires, some or all functional units may not process any instructions at a particular time. Functional units that do not process any instruction are allocated a No-Operation (NOP) instruction.
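The NOP padding described above can be illustrated with a minimal sketch. It assumes a simple greedy packer, single-cycle latency, and uniquely named instructions; the names `pack_bundles`, `nop_count`, and `deps` are hypothetical and chosen only for illustration, and the sketch does not represent any particular compiler or processor.

```python
NOP = "NOP"

def pack_bundles(instructions, deps, num_units):
    """Greedily pack instructions into bundles of num_units slots.

    instructions: instruction names in program order (assumed unique).
    deps: maps an instruction to the set of instructions whose results it needs.
    """
    done = set()                 # instructions issued in earlier bundles
    pending = list(instructions)
    bundles = []
    while pending:
        # An instruction is ready only if all of its producers were
        # issued in an earlier bundle.
        ready = [i for i in pending if deps.get(i, set()) <= done][:num_units]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependency")
        # Slots with no valid instruction are filled with NOPs.
        bundles.append(ready + [NOP] * (num_units - len(ready)))
        done.update(ready)
        pending = [i for i in pending if i not in ready]
    return bundles

def nop_count(bundles):
    """Count the NOP slots across all bundles."""
    return sum(slot == NOP for bundle in bundles for slot in bundle)
```

For example, with four functional units and a dependency chain a, b, c plus an independent instruction d, the call `pack_bundles(["a", "b", "c", "d"], {"b": {"a"}, "c": {"b"}}, 4)` yields three bundles in which 8 of the 12 slots hold NOPs.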
As a result, the NOP instructions, which perform no useful work for the VLIW processor, increase the total number of instructions and may deteriorate processor performance. In particular, the larger number of instructions occupies more memory, which may increase the probability of a cache miss and thereby slow the system. In addition, the large number of instructions may cause instruction fetch overhead.
Methods of compressing instructions and storing the compressed instructions have been researched in an effort to prevent the performance deterioration of a VLIW processor. For example, a method has been introduced that generates a separate operation for removing NOP bundles from all issue slots of a VLIW processor. The separate operation may include information that indicates the number of successive cycles for which a NOP bundle is to be performed, and a P bit that indicates whether a subsequent operation can be executed in parallel. The separate operation, however, must be allocated a code, and the information that indicates the number of successive cycles must be stored in a register file, which may reduce the clock speed with respect to instruction fetch.
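The idea of replacing successive NOP bundles with a single operation that carries a cycle count may be sketched as follows. This is a simplified illustration only: the tuple encoding and the names `compress_nop_bundles`, `decompress`, `NOP_RUN`, and `BUNDLE` are assumptions made for this sketch, and the P bit and the register file storage mentioned above are omitted.

```python
NOP = "NOP"

def compress_nop_bundles(bundles):
    """Replace each run of all-NOP bundles with one ("NOP_RUN", count) record."""
    out = []
    for bundle in bundles:
        if all(slot == NOP for slot in bundle):
            if out and out[-1][0] == "NOP_RUN":
                # Extend the current run by one cycle.
                out[-1] = ("NOP_RUN", out[-1][1] + 1)
            else:
                out.append(("NOP_RUN", 1))
        else:
            # Bundles with at least one valid instruction are kept as-is.
            out.append(("BUNDLE", list(bundle)))
    return out

def decompress(compressed, num_units):
    """Expand NOP_RUN records back into all-NOP bundles of num_units slots."""
    bundles = []
    for kind, payload in compressed:
        if kind == "NOP_RUN":
            bundles.extend([NOP] * num_units for _ in range(payload))
        else:
            bundles.append(list(payload))
    return bundles
```

In this sketch, a sequence of four two-slot bundles whose second and third bundles contain only NOPs is stored as three records, and `decompress` restores the original sequence. Individual NOP slots inside a partly filled bundle are not removed, which reflects that the operation described above targets whole NOP bundles.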