The present invention relates to the provision of instruction streams to a processing device. In preferred embodiments, the invention relates to a method of expanding the instruction stream available to a processing device and thereby enabling a reduction in instruction size.
In general, programmable devices have their operation controlled by a stream of instructions, generally termed an instruction stream. Such programmable devices include, but are not limited to, microprocessors. Each instruction within a stream will typically be a pattern of bits of a predetermined length, termed an instruction word. Each pattern of bits is an encoding which represents a particular instruction to the programmable device. For most programmable devices, operations are controlled on a cycle-by-cycle basis, although there are some programmable devices, such as some types of field programmable gate array (FPGA), which cannot be meaningfully described as controlled in this way. Field programmable devices are of particular interest for certain embodiments of the invention, though the examples described will show cycle-by-cycle control.
The encoding of instructions is a compromise between several factors. Firstly, it is desirable for a large number of different operations to be encodable, so that a rich functionality is available to the programmer of the device. Secondly, it is desirable for decoding of instructions to be easy: that is, for relatively little circuitry to be required to convert the external instruction into the required internal control signals. Both these factors lead towards a large number of bits in each instruction word. However, the third factor is that it is generally desirable to have a small number of bits in each instruction word: otherwise large quantities of time and circuit space will be consumed to accommodate the broad data channel required.
One area where these tensions in satisfactory instruction handling are particularly apparent is in RISC (Reduced Instruction Set Computer) processing design. RISC involves limited instruction sets made up of simplified instructions, as opposed to the instructions of CISC (Complex Instruction Set Computer) design prevailing up until the mid 1980s: in CISC design, it has generally been considered desirable to specify an instruction for each useful eventuality. General microprocessor design has moved towards RISC design in order to increase speed (as the processing units can be simple, since they deal with limited instructions) and to reduce cost (as RISC designs generally require fewer transistors than equivalent CISC designs). However, as RISC lacks the richness of instruction choice present in CISC, code written for RISC processors tends to be considerably longer than code written for CISC processors. In this respect, RISC processors are at a disadvantage relative to CISC processors.
This disadvantage can be more than overcome by providing a rich instruction set with small instruction size. Reducing instruction size is advantageous, as it reduces the overall memory-to-processor bandwidth required for the instruction path, and may also reduce the amount of memory needed to store the program (which may be significant in embedded applications in particular). One approach to reduction of instruction size is the "Thumb" architecture of Advanced RISC Machines Limited (ARM), described for example in the World Wide Web site http:/www.dev-com.com/xcx9criscm/Pro+Peripherals/archExt/Thumb/Flyer/ and in U.S. Pat. No. 5,568,646. The ARM processor is a 32-bit processor, with a 32-bit instruction set. The Thumb instruction set comprises a selection of the most used instructions in this 32-bit instruction set, which is then compressed into a 16-bit form. These 16-bit instructions are then decompressed at the processor into 32-bit code. This solution does allow the use of a 16-bit instruction path for a 32-bit processor, but requires additional complexity in the instruction pipeline and relies on reducing the instruction set to a selected group of instructions.
It is therefore desirable to find an alternative approach to optimising the provision of instructions to processing devices, so that rich functionality and ease of decoding can be achieved at a reduced instruction size.
Accordingly, the invention provides a circuit for providing an instruction stream to a processing device, comprising: an input to receive an external instruction stream for provision of a first set of instruction values; a memory adapted to contain a second set of instruction values; two or more outputs for provision of output instruction streams to the processing device; a control input; and a selection means adapted to distribute the first set of instruction values and the second set of instruction values between the two or more outputs according to the control input.
In this context the term "processing device" is used for essentially any processing element with a capability to accept instructions and perform an information processing function: this clearly includes elements such as CPUs, but also includes processing elements contained within a field programmable array. An example of the application of the invention to such a structure is provided below.
The use of a second set of instructions allows the functionality of the instruction set available at the word length provided in the external instruction stream to be enhanced. Advantageously, it can allow "expansion" of the instruction word, such that the output instruction streams together contain more bits than the external instruction stream. Alternatively, it can allow bits to be diverted from the instruction stream to drive peripheral circuitry for the processing device, which may in itself provide an effective expansion to the instruction set. This peripheral circuitry can be used for a range of functions: an example is to enable or disable data inputs to the processing device.
In a preferred arrangement, the selection means provides for a bitwise selection of values between the first set of instruction values and the second set of instruction values, wherein for each selection of a value one bit from either the first set of instruction values or the second set of instruction values is directed to one of the two or more outputs, and a corresponding bit from the other of the first set of instruction values and the second set of instruction values is directed to another of the two or more outputs. In this arrangement, the second set of instruction values may be provided as a variable, but in advantageous embodiments it will be provided as one or more constants (for example, a value defined before the start of the external instruction stream, perhaps at device configuration in the case of a configurable or reconfigurable device).
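By way of illustration only, the bitwise selection described above can be modelled in software as follows. This is a minimal sketch rather than a description of any particular embodiment: the function name distribute, the 4-bit default width, the use of exactly two outputs, and the convention that a control bit of 0 routes the external bit to the first output are all assumptions made for the example.

```python
def distribute(external: int, stored: int, control: int, width: int = 4) -> tuple[int, int]:
    """Model a bitwise selection between two sets of instruction values.

    external: word from the external instruction stream (first set of values)
    stored:   word held in memory (second set of values)
    control:  per-bit select; the polarity used here is an assumption
    Returns the pair (out_a, out_b) of output instruction words.
    """
    mask = (1 << width) - 1
    external &= mask
    stored &= mask
    control &= mask

    # Where a control bit is 0, out_a takes the external bit and out_b the
    # stored bit; where it is 1, the two bits are exchanged.
    out_a = (external & ~control) | (stored & control)
    out_b = (stored & ~control) | (external & control)
    return out_a & mask, out_b & mask


if __name__ == "__main__":
    out_a, out_b = distribute(0b1010, 0b0101, 0b0011)
    print(f"{out_a:04b} {out_b:04b}")  # prints "1001 0110"
```

In this model a 4-bit external word and a 4-bit stored word together yield 8 output bits per cycle, which corresponds to the case where the output instruction streams together contain more bits than the external instruction stream.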
A further useful feature which can improve utilisation within a larger circuit is the use of means to disable either the provision of instructions from the external instruction stream or from the second set of instruction values: these features can reduce program difficulties by allowing one or other device function to be "ignored".
While this approach is effective for use with a processor device which has a datapath width which is the same for both instructions and data, and for which register use is specified independently from instruction function (as is generally the case with RISC processors), it also has clear advantages in other forms of processor design where similar problems exist. The application of the present invention will be discussed not only with respect to RISC design, but also with regard to the design of field programmable devices containing a plurality of processor elements.
A particularly relevant form of field programmable device for application of the invention is one in which the plurality of processing devices are connected to one another by a configurable wiring network, and in which the processing devices are (or comprise) ALUs, especially relatively small ALUs (such as 4-bit ALUs).