Multiprocessor systems fall roughly into two classes: systems designed for symmetric multiprocessing (SMP), in which the processors are homogeneous, and systems designed for asymmetric multiprocessing (AMP), in which the processors are heterogeneous.
The latter, heterogeneous type of multiprocessor system generally employs a control method in which a main processor (MP) directly controls the execution of a plurality of sub-processors.
In this method, the main processor, which manages the entire system and executes the main processing, also controls activation of each of the functionally distributed sub-processors.
In this case, each sub-processor is controlled over a system bus to which the main processor holds the access right. A sub-processor notifies completion of its processing by inputting an interrupt signal to the main processor, which then checks the state of that sub-processor over the system bus.
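The control flow described above can be illustrated with a minimal software model. This is only a sketch: the class names, the `status` register, the `irq_lines` queue, and the `bus_accesses` counter are hypothetical and are not taken from any cited literature.

```python
from collections import deque


class SubProcessor:
    """Hypothetical functionally distributed sub-processor."""

    def __init__(self, name):
        self.name = name
        self.status = "idle"  # status register read by the MP over the system bus
        self.job = None

    def start(self, job):
        self.status = "busy"
        self.job = job

    def finish(self, irq_lines):
        # Completing work updates the status register and raises an interrupt.
        self.status = "done"
        irq_lines.append(self.name)


class MainProcessor:
    """Holds the system-bus access right and controls every sub-processor."""

    def __init__(self, subs):
        self.subs = {s.name: s for s in subs}
        self.irq_lines = deque()  # pending interrupt signals
        self.bus_accesses = 0     # MP-side bus transactions (assumed cost metric)
        self.completed = []

    def activate(self, name, job):
        self.bus_accesses += 1    # one bus write to start the sub-processor
        self.subs[name].start(job)

    def service_interrupts(self):
        while self.irq_lines:
            name = self.irq_lines.popleft()
            self.bus_accesses += 1  # one bus read to check the state
            sub = self.subs[name]
            if sub.status == "done":
                self.completed.append((name, sub.job))
                sub.status = "idle"
```

In this model every activation and every completion check passes through the main processor, so `bus_accesses` grows with the number of sub-processors and jobs.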
Such a control method has the advantages of making the control sequence of the entire multiprocessor system easy to design and implement and of providing high observability for debugging, because state management of the entire system and of each sub-processor is aggregated in one place.
This method, however, has the problem that processing may fail, because the recent increase in the scale and complexity of system LSIs concentrates processing loads on the main processor.
To solve this problem, a technique that provides an execution control unit dedicated to parallel execution has been proposed as related art, such as that disclosed in Patent Literature 1.
The related art recited in Patent Literature 1 distributes the load of the main processor and raises the operating rate through parallel execution of the sub-processors, by implementing a control mechanism that operates at least two sub-processors in parallel in a pipeline fashion.
The execution control unit, however, has a circuit configuration that depends on the number of connected sub-processors, for example in its inputs for completion notification signals. Because recent system LSIs commonly integrate a plurality of sub-processors (IP cores) on one chip, there is a demand for system expansion, that is, an increase in the number of cores, in a short TAT (Turn Around Time). Each such expansion requires changing the circuit configuration of the execution control unit or the capacity of its internal command table, so expandability is low. Another problem is that complicated execution control processing, such as processing that includes three or more mutually dependent processes, is difficult to realize, so flexibility is low.
As another related art, a technique such as that disclosed in Patent Literature 2 has been proposed, which improves the flexibility and expandability of an execution control circuit dedicated to parallel execution control.
The related art recited in Patent Literature 2 achieves load decentralization of execution control processing and improvement in expandability and flexibility at the same time by providing the execution control circuit with an execution control processor, a status bus input unit for checking a processing status from each sub-processor, and a status FIFO whose capacity is variable.
This technique, however, requires a processing time on the order of, for example, several tens to several hundreds of cycles for the execution control processing of one processing status, because the execution control processing of each sub-processor is executed by software on the execution control processor.
Another problem is that, because an increase in the number of sub-processors or in the number of processing statuses directly increases the amount of processing on the execution control processor, the risk of a processing failure of that processor, or of needing additional execution control circuits, also increases.
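The scaling behavior described above can be sketched with a small model in which software drains a status FIFO one entry at a time. The function name and the figure of 100 cycles per status are assumptions chosen for illustration; the text gives only the range of several tens to several hundreds of cycles.

```python
from collections import deque

# Assumed cost of handling one processing status in software.
CYCLES_PER_STATUS = 100


def control_cycles(num_subprocessors, statuses_per_subprocessor,
                   cycles_per_status=CYCLES_PER_STATUS):
    """Drain a status FIFO in software and return the cycles spent on control."""
    fifo = deque(
        (sp, st)
        for sp in range(num_subprocessors)
        for st in range(statuses_per_subprocessor)
    )
    cycles = 0
    while fifo:
        fifo.popleft()               # software reads one processing status
        cycles += cycles_per_status  # and decides which sub-processor to activate
    return cycles
```

Under these assumptions, doubling the number of sub-processors from 4 to 8 with two statuses each doubles the control load from 800 to 1600 cycles, which is the linear growth the paragraph above points to.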
As yet another related art, a technique such as that disclosed in Patent Literature 3 has been proposed, which speeds up control of sub-processors by a main processor by using an instruction buffer and a response buffer.
According to the related art recited in Patent Literature 3, an idle sub-processor reads the instruction buffer on its own initiative, which eliminates the need for the main processor to recognize which sub-processor is idle and thereby speeds up sub-processor control.
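The pull model described above can be sketched as a cycle-by-cycle simulation. This is an illustrative assumption rather than the mechanism of Patent Literature 3 itself: `run_pull_model`, the `(name, cost)` instruction format, and the cycle accounting are all hypothetical.

```python
from collections import deque


def run_pull_model(instructions, num_subprocessors):
    """Simulate idle sub-processors fetching work from a shared buffer.

    instructions: list of (name, cost_in_cycles) written by the main processor.
    Returns the response buffer as a list of (name, subprocessor_index).
    """
    instruction_buffer = deque(instructions)
    response_buffer = []
    current = [None] * num_subprocessors   # instruction held by each sub-processor
    remaining = [0] * num_subprocessors    # cycles left on that instruction

    while instruction_buffer or any(c is not None for c in current):
        # Idle sub-processors read the buffer themselves; the main processor
        # never has to track which one is idle.
        for i in range(num_subprocessors):
            if current[i] is None and instruction_buffer:
                current[i], remaining[i] = instruction_buffer.popleft()
        # Every busy sub-processor advances by one cycle.
        for i in range(num_subprocessors):
            if current[i] is not None:
                remaining[i] -= 1
                if remaining[i] <= 0:
                    response_buffer.append((current[i], i))
                    current[i] = None
    return response_buffer
```

Note that the sketch lets any sub-processor take any instruction, which holds only when the sub-processors are homogeneous.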
This technique, however, is premised on homogeneous symmetric multiprocessing (SMP), in which each sub-processor is capable of processing any task, so that it is not applicable in the first place to the heterogeneous asymmetric multiprocessing (AMP) system targeted by the present invention.
In addition, as with the related art recited in Patent Literature 2, execution control is performed by software on the main processor, so that even with this speed-up the technique requires a processing time on the order of several tens to several hundreds of cycles for the execution control processing of one task.
Patent Literature 1: Japanese Patent Laying-Open No. 2003-208412
Patent Literature 2: International Publication WO2010/016169A1
Patent Literature 3: Japanese Patent Laying-Open No. H09-218859
The first problem to be solved is that, when execution control of each sub-processor in a multiprocessor system is realized not by a main processor but by a dedicated execution control processor (CP), the execution control processing time (latency) may cause a processing failure of the multiprocessor system as a whole.
The reason is that, with the growing demand for faster image processing system LSIs and communication processing system LSIs in recent years, the processing time required for execution control between the sub-processors increases the possibility that the entire processing will not finish in time. In particular, for a data processing system in which data is processed in a pipeline fashion by using the sub-processors in sequence, how far the time overhead in the processing of the respective sub-processors can be reduced will be crucial in the future.
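A rough calculation shows how the control overhead accumulates in a pipeline. The following sketch assumes one execution control hand-off after each stage; the function names and all cycle counts in the usage note are hypothetical values chosen only to illustrate the effect.

```python
def pipeline_latency(stage_cycles, control_cycles):
    """Cycles for one data item to pass through all stages in sequence,
    assuming one execution-control hand-off after each stage."""
    return sum(stage_cycles) + control_cycles * len(stage_cycles)


def meets_deadline(stage_cycles, control_cycles, deadline_cycles):
    """True when the item finishes within the assumed deadline."""
    return pipeline_latency(stage_cycles, control_cycles) <= deadline_cycles
```

For example, with four stages of 500 cycles each and an assumed deadline of 2200 cycles, a control overhead of 100 cycles per hand-off gives 2400 cycles and misses the deadline, whereas an overhead of 5 cycles gives 2020 cycles and meets it.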
If, for example, the execution control processing performed by the execution control processor is instead implemented entirely in dedicated hardware in order to speed it up, flexibility and expandability are reduced, which is the second problem described below.
The second problem is the low flexibility and expandability of the multiprocessor system as a whole when dedicated execution control hardware is implemented.
The reason is that the interface and circuit configuration of an execution control unit generally tend to depend on the number of connected sub-processors or the number of processing statuses. In other words, when the number of sub-processors or processing statuses is increased in order to expand the system, a circuit change is required, such as a change to the interface or the table capacity of the execution control unit itself.
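This dependence can be modeled in software as a unit whose port count and table capacity are fixed at construction, standing in for values fixed at circuit design time. The class and method names below are hypothetical, and the lookup keyed on a single (port, status) pair is an assumption about how such a table might be organized.

```python
class FixedExecutionControlUnit:
    """Models dedicated hardware whose interface width and table size
    are fixed at design time (here, at construction)."""

    def __init__(self, num_ports, table_capacity):
        self.num_ports = num_ports            # completion-signal inputs
        self.table_capacity = table_capacity  # entries in the command table
        self.table = {}                       # (port, status) -> port to activate

    def add_rule(self, port, status, next_port):
        if port >= self.num_ports or next_port >= self.num_ports:
            raise ValueError("no port for this sub-processor")
        is_new = (port, status) not in self.table
        if is_new and len(self.table) >= self.table_capacity:
            raise OverflowError("command table is full")
        self.table[(port, status)] = next_port

    def on_status(self, port, status):
        """Table look-up: which sub-processor to activate next, if any."""
        return self.table.get((port, status))
```

A unit built with two ports and two table entries handles two sub-processors, but adding a third sub-processor or a third status raises an error; in hardware the corresponding remedy would be a redesign of the unit.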
A further problem is that it is generally difficult to realize, with dedicated hardware such as table look-up, complicated execution control processing that involves processing by three or more mutually dependent sub-processors. Even if such processing is realized by a complicated circuit configuration, a circuit specialized for that particular execution control processing will have very low flexibility and expandability.