In the fields of system LSIs and microcontrollers, multi-core structures, in which a plurality of CPUs are implemented on one chip, have been increasingly adopted in order to enhance performance while suppressing the increase in power consumption that stems from higher clock frequencies. In parallel with this trend, the design of a system LSI requires a choice between providing a memory dedicated to each CPU and arranging two or more CPUs to share a common memory.
In a distributed memory system, in which a dedicated memory is provided for each CPU and each CPU is connected with its memory through a low-latency bus, the CPUs run different programs. Therefore, memory accesses and data transfers on the low-latency buses do not conflict with one another. On this account, distributed memory systems have the advantage that highly parallel programs can be run with low latency. However, in a distributed memory system, the programs and data handled by the LSI as a whole are distributed and stored in the individual memories. Therefore, it is considered that the finite memory capacity cannot be used effectively and that unused memory regions tend to arise.
In contrast, in a shared memory system, in which a memory is shared by the CPUs, a single memory region can be used effectively; however, conflicts between accesses by the CPUs to the shared memory and conflicts between data transfers on a bus tend to occur. The resulting increase in the latency of data transfer inside the chip can degrade performance.
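The contrast between the two systems can be illustrated with a minimal sketch. It assumes a single-ported shared memory that serves simultaneous requests one at a time and a fixed access time in cycles; the function names and the serialization model are illustrative assumptions, not taken from the cited document.

```python
def shared_memory_latencies(num_cpus: int, access_cycles: int) -> list[int]:
    """Latency seen by each CPU when all CPUs request a single-ported
    shared memory in the same cycle and are served one at a time
    (assumed first-come, first-served order)."""
    return [(position + 1) * access_cycles for position in range(num_cpus)]


def distributed_memory_latencies(num_cpus: int, access_cycles: int) -> list[int]:
    """Latency seen by each CPU when every CPU reads its own dedicated
    memory, so that no request waits for another."""
    return [access_cycles] * num_cpus


if __name__ == "__main__":
    print(shared_memory_latencies(4, 3))       # [3, 6, 9, 12]
    print(distributed_memory_latencies(4, 3))  # [3, 3, 3, 3]
```

Under these assumptions, the last CPU served by the shared memory waits for a time proportional to the number of conflicting CPUs, whereas the latency in the distributed case is independent of the number of CPUs.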
On this account, as is also presupposed in JP-A-2001-306307, a shared memory system is required to reduce the latency resulting from conflicts in memory access and bus use while making effective use of a memory region.
Now, in a multiprocessor system, during an initial diagnosis at system setup that is executed in accordance with an initial diagnosis program (hereinafter referred to as “firmware” or by its abbreviation “FW”), CPUs on two or more modules concurrently read the diagnosis program from a non-volatile memory that is common to the CPUs and is located on a diagnosis module, and thus the system performance deteriorates.
According to JP-A-2001-306307, to reduce conflicts in memory access and on the bus shared in the system during execution of the FW, a FROM (slave non-volatile ROM) is placed on each processor module, and the FW is saved in the FROM of each processor module in advance. These means shorten the logical and physical distances between each CPU and the diagnosis program, thereby lowering the latency of the initial diagnosis process and shortening the system setup time. Therein, FW whose own coherency has been verified is used as a master program, and the FW on the FROM in each processor module is checked against the FW on the FROM in the diagnosis module before the FW on the FROM in each processor module is executed. Specifically, before execution of the diagnosis, the version of the FW on the FROM in the diagnosis module SVP is compared with that of the FW on the FROM in each processor module. Then, the non-volatile memory storing the FW of the newer version is placed in the address space as a master ROM. This version comparison and decision can be performed by comparing at most several bytes of data. Therefore, it is possible to reduce the number of instruction fetches through the diagnosis path, which slows access.
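The version comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the `FirmwareRom` type, the two-field version layout, and the handling of equal versions are hypothetical and are not specified in JP-A-2001-306307.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirmwareRom:
    """Hypothetical stand-in for a FROM that holds a copy of the FW."""
    location: str             # e.g. "diagnosis module" or "processor module 0"
    version: tuple[int, int]  # a few bytes of version data (assumed major, minor)


def select_master(diagnosis_rom: FirmwareRom, module_rom: FirmwareRom) -> FirmwareRom:
    """Return the ROM that holds the newer FW.

    Only the short version fields are compared, not the FW images.
    When the versions are equal, the processor module ROM is kept; this
    tie rule is an assumption made here so that the local copy is used.
    """
    if diagnosis_rom.version > module_rom.version:
        return diagnosis_rom
    return module_rom


if __name__ == "__main__":
    svp = FirmwareRom("diagnosis module", (2, 1))
    local = FirmwareRom("processor module 0", (2, 0))
    print(select_master(svp, local).location)
```

In this sketch the decision costs a single comparison of a few bytes, which corresponds to the point that the check itself requires few accesses through the slow diagnosis path.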
Thus, in the case where the coherency of the FW on the FROM in a processor module is not confirmed, an additional process of updating the FW is required, whereas in the case where the coherency is confirmed, the diagnoses of the CPUs and the cache memories can be performed in parallel with low latency without using the system shared bus or the diagnosis path.
However, in multi-core microcontrollers, the application of non-volatile memories is not limited to such initial diagnoses. In multi-core microcontrollers with shared memories, a user program and control data are stored in an on-chip non-volatile memory and are read by the CPUs. The basic processes during execution of a user program are instruction fetch from the non-volatile memory, instruction decode, and instruction execution by each CPU. Further, depending on the type of instruction, data are frequently read from the non-volatile memory into a register in the CPU. For this reason, low latency is required not only at system setup but also for reads of the non-volatile semiconductor memory during normal operation.
On top of that, microcontrollers are required to achieve high performance, high reliability, and low cost. In the field of microcontrollers, it is not advisable, in terms of chip area and cost, to provide a non-volatile memory exclusively for each CPU in addition to a shared non-volatile semiconductor memory that the CPUs can read, as in JP-A-2001-306307. What is essential for achieving high performance and low cost simultaneously in microcontrollers is a technique for realizing a non-volatile semiconductor memory that can be read with low latency even when read requests from CPUs conflict, despite adoption of the shared memory system.
Commonly adopted measures include disposing a low-latency cache memory between the non-volatile semiconductor memory and the CPUs, and using a hierarchical memory to lower the latency. However, in fields in which particularly high reliability is required, such as automotive control, it is necessary to ensure not only peak performance but also performance in near-worst cases, including the case of a low cache hit ratio. For these reasons, it is essential to lower the latency of the non-volatile semiconductor memory itself.
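The dependence on the cache hit ratio can be illustrated with a simple average-latency model. The model below is a common textbook approximation and an assumption of this sketch: a hit costs only the cache access time, and a miss costs the cache access time plus the access time of the non-volatile memory. The function and parameter names are illustrative.

```python
def average_read_latency(hit_ratio: float, cache_cycles: float, memory_cycles: float) -> float:
    """Average read latency in cycles for a cache placed in front of a
    non-volatile memory (assumed model: hit = cache_cycles,
    miss = cache_cycles + memory_cycles)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return cache_cycles + miss_ratio * memory_cycles


if __name__ == "__main__":
    for ratio in (1.0, 0.5, 0.0):
        print(ratio, average_read_latency(ratio, 1, 10))
```

In this model, as the hit ratio approaches zero the average latency approaches the full latency of the non-volatile memory, so the near-worst case is governed by the memory itself rather than by the cache.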
Therefore, it is an object of the invention to provide a technique for realizing low-latency access even when access requests from CPUs conflict.
The above and other objects and novel features of the invention will be apparent from the description hereof and the accompanying drawings.