The present invention relates to a prefetch unit for use in a computer system.
In a computer system, instructions are typically fetched from a program memory, decoded and supplied to an execution unit where they are executed to run the program stored in the program memory. If more than one execution unit is provided, it is possible to arrange for very high speed instruction execution. In order to take advantage of this, it is clearly necessary to be able to supply decoded instructions to the execution unit at a sufficient rate. Presently, access times to memory cannot match execution speeds, and therefore several machine cycles are needed to access each new instruction from memory. Thus, there can be a severe performance degradation because the fetches from memory cannot match the rate at which instructions can be executed by the execution units.
According to the present invention there is provided a prefetch buffer for holding instructions in a processor having a memory and an instruction decode unit, the prefetch buffer comprising:
a plurality of storage locations, each having the same bit capacity (2n bits) and arranged in groups with the same number p of storage locations in each group;
a write port for selectively writing words of bit length 2n×p from the memory into respective groups of the prefetch buffer;
read circuitry for reading instructions out of the prefetch buffer in dependence on an instruction mode of the processor, said instruction mode controlling the number of storage locations which are read during a machine cycle; and
means for indicating when all storage locations in a group have been read so that a fetch signal can be issued to fetch a next word from the memory into the storage locations of that group.
In the described embodiment, each storage location has a capacity of 32 bits, and the locations are arranged in groups of four such that each group has capacity for a 128 bit word read out of memory on a memory fetch. In the described embodiment, four groups of storage locations are provided in the prefetch buffer, thus allowing for up to four successive memory accesses to be in progress even if the first word has not yet been received or executed. Moreover, because the processor supports more than one instruction mode, the time which it takes to read all storage locations in a group, in terms of machine cycles, can vary. According to the invention, the indicating means allow a next word to be fetched from memory when all storage locations in a group have been read, however many machine cycles that has taken. Memory latency is thus hidden by this mechanism.
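The geometry of the described embodiment can be summarised by a short sketch; the constant names here are illustrative and do not appear in the specification:

```python
# Illustrative model of the prefetch buffer geometry of the described
# embodiment: 32-bit storage locations, four per group, four groups.
LOCATION_BITS = 32        # capacity of each storage location
LOCATIONS_PER_GROUP = 4   # p locations per group
NUM_GROUPS = 4            # groups in the prefetch buffer

GROUP_BITS = LOCATION_BITS * LOCATIONS_PER_GROUP  # one memory word per fetch
BUFFER_BITS = GROUP_BITS * NUM_GROUPS             # total buffer capacity

print(GROUP_BITS)   # 128
print(BUFFER_BITS)  # 512
```

Each memory fetch therefore fills exactly one group, and the four groups together can hold four successive memory words.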
According to a first instruction mode, one storage location is read out during each machine cycle to provide a pair of 16 bit instructions to the decode unit (referred to herein as GP16 mode).
According to a second instruction mode, two storage locations are read during each machine cycle to provide two 32 bit instructions to the decode unit (referred to herein as GP32 mode).
According to a third instruction mode, four storage locations are read out during each machine cycle to provide four instructions each of 32 bits to the decode unit (referred to herein as VLIW (Very Long Instruction Word) mode).
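The three instruction modes differ only in how many storage locations are read per machine cycle. A minimal sketch, using illustrative names not drawn from the specification, makes the relationship explicit:

```python
# Sketch of locations read per machine cycle in each instruction mode;
# the mode names (GP16, GP32, VLIW) follow the text above.
LOCATION_BITS = 32

LOCATIONS_READ_PER_CYCLE = {
    "GP16": 1,  # one location -> a pair of 16 bit instructions
    "GP32": 2,  # two locations -> two 32 bit instructions
    "VLIW": 4,  # four locations -> four 32 bit instructions
}

def bits_read_per_cycle(mode: str) -> int:
    """Instruction bits supplied to the decode unit per machine cycle."""
    return LOCATIONS_READ_PER_CYCLE[mode] * LOCATION_BITS

print(bits_read_per_cycle("GP16"))  # 32
print(bits_read_per_cycle("VLIW"))  # 128
```

In VLIW mode a whole group is consumed in a single cycle, whereas in GP16 mode the same group lasts four cycles, which is why the refill indication must be independent of cycle count.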
In the described embodiment, the indicating means comprises a set of flags, each group having a flag associated therewith which is set to indicate that all storage locations in the associated group have been read so as to initiate a subsequent memory fetch.
The invention also provides a prefetch unit comprising a prefetch buffer as hereinabove defined and control circuitry arranged to monitor the indicating means and to issue a fetch signal to memory to fetch the next word into the prefetch buffer when all storage locations in a group have been read. The control circuitry can include an aligner for controlling a read pointer determining the storage locations to be read in a next machine cycle.