1. Field of the Invention
The present invention generally relates to a central processing unit and system, and, more particularly, to a central processing unit and system having a pre-fetch function.
2. Description of the Related Art
A central processing unit (CPU) reads commands and data from a main memory, and performs controls and operations in accordance with the commands and data. Generally, a processing speed of a CPU is higher than a processing speed of a main memory. Accordingly, a time period for reading commands and data from the main memory greatly influences a processing time of the CPU. In order to reduce the processing time, a high-speed memory device having a small capacity is provided between the CPU and the main memory.
A command cache is used as such a high-speed, small-capacity memory. The command cache can store a predetermined number of commands that have been referred to by the CPU. Additionally, a prefetch queue may be provided, which automatically reads a predetermined number of commands at addresses subsequent to the address of the command being executed by the CPU.
It is recognized that a CPU tends to refer to information within a limited small range of a memory area during a certain period. In other words, there is a high possibility that frequently used pieces of information are located close to each other. Accordingly, the above-mentioned command cache and prefetch queue are used for reading such pieces of information, which are stored within a small range of the main memory that is expected to be frequently referred to by the CPU.
A description will now be given, with reference to FIG. 1, of an operation of a conventional central processing system in which a high-speed, small-capacity memory device is provided between a CPU and a main memory. FIG. 1 is a block diagram of the conventional central processing system having a pre-fetch function.
As shown in FIG. 1, the central processing system 1 comprises a CPU 10, a command cache 12, a bus controller 14 and a main memory 16. The CPU 10 comprises a prefetch queue 18, a command decoder 20 and a bus access control unit 22.
The prefetch queue 18 automatically reads and stores a command at an address subsequent to the address of the command being read by the CPU 10. The command cache 12 stores a predetermined number of commands that have previously been read from the main memory 16.
A description will now be given, with reference to FIG. 2, of an operation of the prefetch queue 18. FIG. 2 is an illustration for explaining an operation of the prefetch queue 18 in a case in which two commands are automatically read and stored. When the CPU 10 reads a command corresponding to the address “08” and stores the read command in the command decoder 20, the prefetch queue 18 stores a command corresponding to a first subsequent address (or next address) “0A” and a command corresponding to a second subsequent address “0C”. Thereafter, when the CPU 10 reads the command corresponding to the address “0A” and stores the read command in the command decoder 20, the prefetch queue 18 stores a command at a second subsequent address “0E” since the command at the first subsequent address is already stored therein.
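The refill behavior described for FIG. 2 can be illustrated with a minimal sketch. It is not taken from the original disclosure: the function name `step`, the queue model, and the 2-byte address increment are assumptions chosen to match the addresses “08”, “0A”, “0C” and “0E” in the text.

```python
from collections import deque

COMMAND_SIZE = 2   # assumption: 16-bit commands, so addresses advance by 2
QUEUE_DEPTH = 2    # two commands are prefetched, as in the case of FIG. 2


def step(queue, address):
    """Hand the command at `address` to the decoder and refill the queue.

    Returns the addresses newly fetched into the queue (hypothetical model).
    """
    if queue and queue[0] == address:
        queue.popleft()        # this command was already prefetched
    else:
        queue.clear()          # queued commands do not follow `address`
    fetched = []
    next_address = (queue[-1] if queue else address) + COMMAND_SIZE
    while len(queue) < QUEUE_DEPTH:
        queue.append(next_address)
        fetched.append(next_address)
        next_address += COMMAND_SIZE
    return fetched


queue = deque()
print([hex(a) for a in step(queue, 0x08)])  # ['0xa', '0xc']
print([hex(a) for a in step(queue, 0x0A)])  # ['0xe']
```

In the second step only one command is fetched, because the command at the first subsequent address is already held in the queue.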
As mentioned above, the prefetch queue 18 reads in advance, from the command cache 12 or the main memory 16, a command that is expected to be executed next by the CPU 10. Thereby, a time period from the start of reading the command by the CPU 10 until the read command is stored in the command decoder 20 is reduced. The prefetch queue 18 is effective when the execution of commands is straightforward and only a small number of branch commands are included in the series of commands.
A description will now be given, with reference to FIG. 3, of an operation of the command cache. FIG. 3 is an illustration for explaining an operation of the command cache 12. It should be noted that, for the commands executed by the CPU 10, the upper 16-bit address “0000” of each address is omitted. Additionally, for the commands stored in the command cache 12, each row indicates the upper 16-bit address at the leftmost position, followed by four lower 8-bit addresses in the same row. It should be noted that 32-bit data can be stored at each address, and two 16-bit commands are stored at the same address.
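The address layout described for FIG. 3 can be sketched as follows. This is one possible reading of the text, not a definitive one: it assumes a 24-bit address made up of the upper 16-bit address and the lower 8-bit address, with each 32-bit word holding two 16-bit commands. The function name `split_address` is hypothetical.

```python
def split_address(address):
    """Split a 24-bit command address as FIG. 3 is read here (assumed layout)."""
    upper = address >> 8       # upper 16-bit address, e.g. 0x0000 or 0x0200
    lower = address & 0xFF     # lower 8-bit address, e.g. 0x0A
    word = lower & ~0x3        # lower address of the 32-bit word: 00, 04, 08, 0C, ...
    slot = (lower >> 1) & 1    # 0 = first 16-bit command in the word, 1 = second
    return upper, word, slot


print(split_address(0x00000A))  # (0, 8, 1)
print(split_address(0x020008))  # (512, 8, 0)
```

Under this reading, the commands at the addresses “08” and “0A” share the word at the lower address “08”.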
When the CPU 10 reads a command corresponding to the address “08”, the reading operation changes according to whether or not the command corresponding to the address “08” is stored in the command cache 12. When the command corresponding to the address “08” is stored in the command cache 12, the command corresponding to the address “08” is supplied from the command cache 12 to the prefetch queue 18 of the CPU 10. In the case of FIG. 3, the command corresponding to the address “000008” is read from the command cache 12, and supplied to the prefetch queue 18.
On the other hand, when the command corresponding to the address “08” is not stored in the command cache 12, the command corresponding to the address “08” is read from the main memory 16, and the read command is supplied to the CPU 10. Additionally, the command corresponding to the address “08” is stored in the command cache 12 at an address having the lower 8-bit address “08”. In the command cache 12, the arrangement of the lower 8-bit addresses is maintained unchanged. Thus, an address stored in the command cache 12 is changed only by changing its upper 16-bit address. That is, for example, when the address “000008” is not present in the command cache 12 and the address “020008” is present instead, the upper 16-bit address “0200” corresponding to the lower 8-bit address “08” is changed to “0000”, and the command read from the main memory 16 is stored at the changed address “000008”.
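The hit and miss handling described above can be sketched as follows. The dictionary-based model, the function name `read_command` and the stored strings are assumptions made for illustration only. For simplicity, one upper 16-bit address is kept per lower 8-bit address.

```python
# Hypothetical model: lower 8-bit address -> (upper 16-bit address, command).
cache = {0x08: (0x0200, "command at 020008")}
main_memory = {0x000008: "command at 000008"}   # made-up contents


def read_command(address):
    """Return (command, hit) for a 24-bit address, updating the cache on a miss."""
    upper, lower = address >> 8, address & 0xFF
    entry = cache.get(lower)
    if entry is not None and entry[0] == upper:
        return entry[1], True          # hit: supplied from the command cache
    command = main_memory[address]     # miss: read from the main memory
    cache[lower] = (upper, command)    # upper address "0200" is changed to "0000"
    return command, False


print(read_command(0x000008))  # ('command at 000008', False)
print(read_command(0x000008))  # ('command at 000008', True)
```

The first read misses because the lower address “08” is held with the upper address “0200”. The second read hits because the upper address has been changed to “0000”.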
The above-mentioned operation is performed each time the commands corresponding to the addresses “0A”, “0C”, “0E”, “10” and “12” are read sequentially. In the case of FIG. 3, the command corresponding to the address “12” is a branch command which directs the command routine to return to the command corresponding to the address “08”. Since the commands corresponding to the addresses “08”, “0A”, “0C”, “0E”, “10” and “12” are already stored in the command cache 12, there is no need to read the commands from the main memory 16. Thus, the time period from the start of reading of the commands by the CPU 10 until the commands are supplied to the command decoder 20 is reduced. The command cache 12 is particularly effective when there are many repetitions in the execution of commands.
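The benefit for repeated execution can be illustrated by counting hits over the loop of FIG. 3. This is a minimal sketch under stated assumptions: the cache starts empty, nothing is evicted while the loop runs, and the function name `run_loop` is hypothetical.

```python
LOOP = [0x08, 0x0A, 0x0C, 0x0E, 0x10, 0x12]   # "12" branches back to "08"


def run_loop(passes):
    """Count cache hits and main-memory reads over `passes` trips round the loop."""
    cached = set()      # command addresses currently held in the command cache
    hits = misses = 0
    for _ in range(passes):
        for address in LOOP:
            if address in cached:
                hits += 1
            else:
                misses += 1          # read from the main memory, then cached
                cached.add(address)
    return hits, misses


print(run_loop(1))  # (0, 6)
print(run_loop(3))  # (12, 6)
```

Only the first pass reads from the main memory; every later pass is served entirely from the command cache.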
As mentioned above, the processing time of the central processing system 1 is reduced by providing the command cache 12 and the prefetch queue 18.
However, if the prefetch queue 18 is provided on the CPU 10 side of the command cache 12 as shown in FIG. 1, the efficiency of operation of the command cache 12 may deteriorate due to the operation of the prefetch queue 18.
A description will now be given, with reference to FIGS. 4 and 5, of a case in which the efficiency of operation of the command cache 12 is deteriorated by the operation of the prefetch queue 18. FIG. 4 is an illustration for explaining an operation of the prefetch queue 18. FIG. 5 is an illustration for explaining an operation of the command cache 12.
In the operation shown in FIG. 4, after the CPU 10 executes the command corresponding to the address “0C”, the CPU 10 executes the command corresponding to the address “08”, since the command corresponding to the address “0C” is a branch command directing the routine to proceed to the address “08”. However, while the CPU 10 executes the command corresponding to the address “0C”, the prefetch queue 18 reads, from the command cache 12 or the main memory 16, the command corresponding to the first subsequent address and the command corresponding to the second subsequent address, and the thus-read commands are stored in the prefetch queue 18. As a result, the prefetch queue 18 stores commands which will not be executed by the CPU 10.
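The wasted prefetches of FIG. 4 can be reproduced with a small sketch. The branch table, the function name `wasted_prefetches` and the 2-byte address increment are assumptions. The model simply records every address that the queue would read ahead of the command being executed.

```python
COMMAND_SIZE = 2          # assumption: 16-bit commands
QUEUE_DEPTH = 2           # first and second subsequent addresses
BRANCHES = {0x0C: 0x08}   # the command at "0C" branches to "08" (FIG. 4)


def wasted_prefetches(start, steps):
    """Return addresses that were prefetched but never executed."""
    executed, prefetched = set(), set()
    pc = start
    for _ in range(steps):
        executed.add(pc)
        for i in range(1, QUEUE_DEPTH + 1):
            prefetched.add(pc + i * COMMAND_SIZE)   # read ahead of `pc`
        pc = BRANCHES.get(pc, pc + COMMAND_SIZE)    # follow the branch if any
    return sorted(prefetched - executed)


print([hex(a) for a in wasted_prefetches(0x08, 6)])  # ['0xe', '0x10']
```

The commands at the addresses “0E” and “10” are read into the queue but are skipped by the branch.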
When the prefetch queue 18 reads and stores the command which will not be executed by the CPU 10, the command is also stored in the command cache 12 since the command cache 12 does not have a function to determine whether a command to be stored therein is actually used by the CPU 10.
In the case of FIG. 5, while the CPU 10 executes the command corresponding to the address “0C”, the command corresponding to the address “0E” and the command corresponding to the address “10” are supplied to the prefetch queue 18 and also stored in the command cache 12. In the command cache 12, the upper 16-bit address corresponding to the address “10” is “0200” as shown in FIG. 5. Accordingly, in order to store the command corresponding to the address “10” in the command cache 12, the upper 16-bit address “0200” is changed to “0000”. Thus, the commands corresponding to the other three addresses “14”, “18” and “1C” that are presently stored in the command cache 12 are deleted. In such a case, since the commands corresponding to the addresses “0E” and “10” are skipped and not executed by the CPU 10, the commands corresponding to the addresses “14”, “18” and “1C” are deleted unnecessarily as a result of the unnecessary storage of the command corresponding to the address “10”. If the deleted commands corresponding to the addresses “14”, “18” and “1C” are frequently used commands, the efficiency of the command reading operation deteriorates.
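The loss of the row contents in FIG. 5 can be sketched as follows. The model assumes one upper 16-bit address per row of four 32-bit words, which is how the deletion of the addresses “14”, “18” and “1C” is read here. The function name `store` and the data layout are hypothetical.

```python
WORDS_PER_ROW = 4
WORD_BYTES = 4


def store(rows, address):
    """Store the word at `address`; return the other words lost from its row."""
    upper, lower = address >> 8, address & 0xFF
    word = lower & ~(WORD_BYTES - 1)                 # word holding the command
    index = lower // (WORDS_PER_ROW * WORD_BYTES)    # row of the cache
    tag, words = rows.get(index, (upper, set()))
    lost = []
    if tag != upper:
        # Changing the upper address invalidates every word left in the row.
        lost = sorted((tag << 8) | w for w in words if w != word)
        words = set()
    words.add(word)
    rows[index] = (upper, words)
    return lost


# Row 1 holds the lower addresses 10, 14, 18 and 1C with upper address 0200.
rows = {1: (0x0200, {0x10, 0x14, 0x18, 0x1C})}
lost = store(rows, 0x000010)
print([f"{a:06X}" for a in lost])  # ['020014', '020018', '02001C']
```

Storing the single prefetched command at the address “000010” removes three other words from the row, even though that command is never executed.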
A description will now be given, with reference to FIG. 6, of a case in which the efficiency of the operation of the command cache 12 deteriorates due to a block transfer function. FIG. 6 is an illustration for explaining an operation of the command cache 12. According to the block transfer function, the information stored in the command cache 12 is rewritten on an individual block basis. It should be noted that a single block is defined by four words in the same row, for example, the addresses “00”, “04”, “08” and “0C” as shown in FIG. 6.
When the CPU 10 reads the command corresponding to the address “1E”, the CPU 10 determines whether or not the command corresponding to the address “1E” is stored in the command cache 12. In the case of FIG. 6, the upper 16-bit address “0200” is assigned to the lower 8-bit addresses “10” to “1C”. Accordingly, the commands corresponding to the addresses having the upper 16-bit address “0000” must be read from the main memory 16.
According to the block transfer function, the command cache 12 automatically writes the commands corresponding to the addresses “000010” to “00001F”. However, a next command cannot be read until the block transfer operation is completed. In this case, the commands corresponding to the addresses “000010” to “00001A” are not executed. Accordingly, the CPU 10 cannot read the command corresponding to the next address “000020” until the writing operation for the commands corresponding to the addresses “000010” to “00001A” is completed in the command cache 12. Thus, the efficiency of operation for reading commands from the command cache 12 deteriorates.
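The extent of the block transfer in FIG. 6 can be worked out with a short sketch. It assumes a block of four 32-bit words and 16-bit commands, as described above. The function name `block_transfer` is hypothetical, and no timing figures are modeled because none are given in the text.

```python
WORD_BYTES = 4
WORDS_PER_BLOCK = 4
COMMAND_SIZE = 2


def block_transfer(address):
    """Return (word addresses written, commands written but not needed)."""
    block_bytes = WORD_BYTES * WORDS_PER_BLOCK
    base = address - address % block_bytes          # first address of the block
    words = [base + i * WORD_BYTES for i in range(WORDS_PER_BLOCK)]
    target_word = address - address % WORD_BYTES    # word holding `address`
    # Commands in the words that precede the requested word are not executed.
    unused = list(range(base, target_word, COMMAND_SIZE))
    return words, unused


words, unused = block_transfer(0x00001E)
print([f"{a:06X}" for a in words])              # ['000010', '000014', '000018', '00001C']
print(f"{unused[0]:06X}", f"{unused[-1]:06X}")  # 000010 00001A
```

For a read of the address “1E”, four words are written, and the commands from “000010” to “00001A” are transferred although they are not executed.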
As mentioned above, there is a problem in that there are some cases in which the command cache 12 and the prefetch queue 18 cannot be efficiently operated in combination.