Conventionally, as disclosed, for example, in Japanese Patent Application Laid-Open Publication No. 4-369733, a processor which uses error correction code for data protection often needs a plurality of cycles of the clock of the processor to calculate the error correction code.
For example, when some data (for example, of 128 bits) stored in a memory or a cache memory is updated, error correction code (for example, of 9 bits) is calculated for the whole updated data (128 bits), and the updated data and the error correction code (hereinafter also referred to as ECC) are written in the memory at the same time. In this case, a plurality of cycles is required for the calculation of the ECC.
There is also a case where only a part of some data (that is, less than 128 bits) is updated. In this case, for example, the whole data (128 bits) already stored in a memory or the like is read, the whole data (128 bits) is generated again with the update data reflected in the read data, an ECC is calculated for the whole data thus generated, and the updated data and the ECC are written in the memory at the same time.
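As an illustration only, the partial update sequence described above (read the whole data, reflect the update data, calculate the ECC, write both together) can be sketched as follows. The names `ecc9` and `store_partial`, the dictionary used as the memory, and the particular check code (8 Hamming check bits plus one overall parity bit) are assumptions made for this sketch; the text specifies only that the ECC is, for example, 9 bits for 128 bits of data.

```python
WORD_BITS = 128                      # data width used in the text
WORD_MASK = (1 << WORD_BITS) - 1


def ecc9(word: int) -> int:
    """Assumed 9-bit check code: 8 Hamming check bits plus 1 overall parity bit."""
    check = 0        # XOR of the codeword positions of all set data bits
    ones = 0         # number of set data bits
    pos = 1
    for i in range(WORD_BITS):
        while pos & (pos - 1) == 0:  # powers of two are reserved for check bits
            pos += 1
        if (word >> i) & 1:
            check ^= pos
            ones += 1
        pos += 1
    parity = (ones + bin(check).count("1")) & 1   # parity over data and check bits
    return (parity << 8) | check


def store_partial(memory: dict, addr: int, update: int,
                  bit_offset: int, bit_width: int) -> None:
    """Update `bit_width` bits at `bit_offset` inside the 128-bit word at `addr`."""
    old_word, _old_ecc = memory[addr]                       # read the whole data
    field_mask = ((1 << bit_width) - 1) << bit_offset
    new_word = (old_word & ~field_mask) | ((update << bit_offset) & field_mask)
    new_word &= WORD_MASK                                   # reflect the update data
    memory[addr] = (new_word, ecc9(new_word))               # ECC and data written together


# Usage: update one byte of an all-zero word.
memory = {0x10: (0, ecc9(0))}
store_partial(memory, 0x10, update=0xAB, bit_offset=8, bit_width=8)
print(hex(memory[0x10][0]))   # 0xab00
```

The ECC is recalculated over the whole 128-bit word even though only 8 bits changed, which is why the stored word must be read before a partial update.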
In other words, although the 128-bit data already stored does not need to be read if the data to be written is itself 128-bit data, data update processing may also update only a part of the data. Therefore, operation cycles need to be reserved for three processes: reading, ECC calculation, and writing. For example, if address calculation, reading, ECC calculation, and writing require two cycles, one cycle, two cycles, and one cycle, respectively, data update processing, that is, data store processing, necessarily requires six cycles. In the store processing, these six cycles form the processing sequence having the longest processing time, that is, a so-called critical path.
On the other hand, in processing that only reads data, that is, data load processing, if address calculation requires two cycles and reading requires one cycle, the load processing requires three cycles.
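The cycle counts given in this example can be tallied as in the following short sketch. The constant names are chosen here for illustration; the per-stage cycle counts are the example values stated above.

```python
# Example per-stage cycle counts taken from the text.
ADDRESS_CYCLES = 2
READ_CYCLES = 1
ECC_CYCLES = 2
WRITE_CYCLES = 1

store_cycles = ADDRESS_CYCLES + READ_CYCLES + ECC_CYCLES + WRITE_CYCLES
load_cycles = ADDRESS_CYCLES + READ_CYCLES
gap = store_cycles - load_cycles   # cycles by which a store lags a load

print(store_cycles, load_cycles, gap)   # 6 3 3
```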
In a case where the number of cycles differs between the store processing and the load processing, there is a problem that when updated data is read immediately after the data is updated, the data before the update is read from the memory or the like. To prevent incorrect data from being read due to this difference between the store processing and the load processing, a so-called forwarding circuit is provided.
The forwarding circuit is a circuit configured to forward data written in preceding store processing to subsequent load processing, for a number of cycles corresponding to the difference between the store processing and the load processing. The forwarding circuit is configured such that, if data is read immediately after processing for writing data in a memory or the like and the read address is the same as the write address, the data is read from one of a plurality of registers which hold the data written in the preceding plurality of cycles, instead of from the memory or the like. To this end, the forwarding circuit has a plurality of address comparator circuits corresponding to the plurality of cycles.
For example, if the difference between data read processing and data update (i.e., write) processing is three cycles, the forwarding circuit holds the respective write addresses of the previous three cycles, has three address comparator circuits which compare those addresses with the present read address, and further has three registers configured to hold the written data corresponding to the three cycles.
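A behavioral sketch of such a forwarding circuit, for the three-cycle example above, might look as follows. This is a software model under stated assumptions, not a circuit description: the class name `ForwardingModel`, its methods, and the use of a dictionary as the memory are hypothetical, and the loop in `load` stands in for the three address comparator circuits, which in hardware would operate in parallel.

```python
from collections import deque


class ForwardingModel:
    """Holds the (address, data) pairs of the previous `depth` cycles."""

    def __init__(self, depth: int = 3):
        self.memory = {}
        # One slot per cycle of difference; the newest entry is on the left.
        self.slots = deque([None] * depth, maxlen=depth)

    def cycle(self, store=None) -> None:
        """Advance one cycle; `store` is an (address, data) pair or None."""
        retiring = self.slots[-1]
        if retiring is not None:
            addr, data = retiring
            self.memory[addr] = data      # the write reaches the memory late
        self.slots.appendleft(store)      # maxlen drops the retired slot

    def load(self, addr):
        """Return forwarded data on an address match, else read the memory."""
        for slot in self.slots:           # one comparison per held address
            if slot is not None and slot[0] == addr:
                return slot[1]
        return self.memory.get(addr)


# Usage: a load issued right after a store to the same address.
model = ForwardingModel(depth=3)
model.memory[0x40] = "old"
model.cycle(store=(0x40, "new"))
print(model.load(0x40))      # new  (taken from a register, not the memory)
print(model.memory[0x40])    # old  (the memory has not been written yet)
```

Searching the newest slot first means that, if the same address was written more than once within the three cycles, the most recent data is returned.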
Furthermore, if the address data has a large bit width such as 64 bits, signal lines corresponding to the number of bits of the address data are connected to each of the comparator circuits for comparison, and signal lines having a large bit width such as 128 bits are connected to each of the registers for output. As a result, there is a problem that the size of the forwarding circuit increases, and the area occupied by the forwarding circuit on a semiconductor chip implementing a CPU also increases.
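To give a sense of scale, the number of bits the forwarding circuit must hold can be counted as in the sketch below, using the example widths from the text (64-bit addresses, 128-bit data, three cycles). The function name `forwarding_storage_bits` is hypothetical, and the result is only a rough count of held bits, not an estimate of chip area.

```python
def forwarding_storage_bits(depth: int, address_bits: int, data_bits: int) -> dict:
    """Count the bits held for `depth` cycles of addresses and written data."""
    held_address_bits = depth * address_bits   # one held address per comparator
    held_data_bits = depth * data_bits         # one data register per cycle
    return {
        "address": held_address_bits,
        "data": held_data_bits,
        "total": held_address_bits + held_data_bits,
    }


print(forwarding_storage_bits(depth=3, address_bits=64, data_bits=128))
# {'address': 192, 'data': 384, 'total': 576}
```

The count grows linearly with the cycle difference, so each additional cycle of difference between store processing and load processing adds another comparator and another data register.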