1. Field of the Invention
The present invention relates to a data processing device, and more particularly to a data processing device that performs computational operations with a system of multiple processors.
2. Description of the Related Art
Multiprocessor systems employ a plurality of processing elements to perform computational operations. Because of limitations in the number of memory devices that can be mounted, as well as to meet cost requirements, many such multiprocessor systems use an architecture in which processors exchange data with each other through a single shared memory. While the shared data memory may receive two or more concurrent access requests, this architecture allows only one processor to access the memory at a time. A bus arbitration mechanism is therefore employed to resolve such memory access conflicts and serve the requesting processors in an orderly fashion.
One typical arbitration method is known as a rotating priority scheme, in which all processors are served with equal priority. Another typical method is a fixed priority scheme, in which each processor is assigned a different priority level that determines the order of data memory access when a contention occurs. These arbitration methods are, however, inflexible in terms of access priority manipulation. Moreover, even a high-priority processor has to wait until the end of an ongoing memory access session performed by another processor with a lower priority. This can produce frequent “stalls” (i.e., states in which a processor is unable to execute its next instruction) when many access requests are concentrated on the data memory.
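The two conventional schemes above can be sketched as follows. This is an illustrative model only; the requester interface, processor numbering, and arbiter structure are assumptions for the sketch, not taken from any particular system.

```python
def fixed_priority_grant(requests):
    """Fixed priority scheme: grant the lowest-numbered requesting
    processor; index 0 always has the highest priority.

    `requests` is a list of booleans, one flag per processor.
    """
    for pid, req in enumerate(requests):
        if req:
            return pid
    return None  # no processor is requesting the bus


class RotatingPriorityArbiter:
    """Rotating priority scheme: after each grant, the processor just
    served drops to the lowest priority, so over time all processors
    are served with equal priority."""

    def __init__(self, n):
        self.n = n
        self.next_pid = 0  # processor examined first on the next grant

    def grant(self, requests):
        for offset in range(self.n):
            pid = (self.next_pid + offset) % self.n
            if requests[pid]:
                self.next_pid = (pid + 1) % self.n  # rotate priority
                return pid
        return None
```

Note that neither function models preemption: once a grant is issued, the sketch assumes the winning processor completes its access session, which is exactly why a high-priority requester may still stall behind a lower-priority one.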
Yet another bus arbitration method is proposed in Japanese Patent Application Publication No. 6-309276 (1994), paragraph Nos. 0006 and 0007 and FIG. 1. To control bus requests, the proposed method uses a variable ID code that changes each time a new bus request signal is issued. This method, however, provides no specific solution to the problem of stalls (e.g., by reducing memory access cycles). Nor does it shorten the duration of each memory access, thus leading to inefficient data processing.
Many multiprocessor systems provide a cache memory for each processor to enable high-speed access to program instructions. That is, instruction codes stored in a relatively slow, large-capacity code memory are transferred to a small, fast cache memory so that the processor can read and execute them at high speed. This memory-to-memory transfer is referred to as a program loading operation, and it occurs upon a “cache miss,” i.e., when a required code is missing from the cache memory. Conventional multiprocessor systems thus trigger a program loading process only on an after-the-fact basis, lacking any mechanism for reducing the chance of a cache miss. This leads to increased instruction fetch times and consequent performance degradation.
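The after-the-fact loading behavior described above can be sketched as a minimal model: an instruction cache that copies a block from the slow code memory only at the moment a fetch misses. The block size, memory layout, and class interface are illustrative assumptions, not details of any actual system.

```python
BLOCK_SIZE = 4  # assumed transfer unit for the sketch

class InstructionCache:
    """Conventional on-demand instruction cache: a program loading
    operation is triggered only when a fetch misses."""

    def __init__(self, code_memory):
        self.code_memory = code_memory  # slow, large-capacity code memory
        self.blocks = {}                # fast cache: block number -> codes
        self.misses = 0

    def fetch(self, address):
        block = address // BLOCK_SIZE
        if block not in self.blocks:    # cache miss detected after the fact
            self.misses += 1
            start = block * BLOCK_SIZE
            # Program loading operation: copy one block from the slow
            # memory; the processor stalls for this entire transfer.
            self.blocks[block] = self.code_memory[start:start + BLOCK_SIZE]
        return self.blocks[block][address % BLOCK_SIZE]
```

In this model a miss, and hence a stall, occurs at every block boundary of a not-yet-loaded region; nothing anticipates the next block, which is the shortcoming the passage above attributes to conventional systems.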