Many mobile communication devices use a radio transceiver that includes one or more digital signal processors (DSPs).
For increased performance and reliability, many mobile terminals presently use a type of DSP known as a baseband processor (BBP) to handle many of the signal processing functions associated with processing the received radio signal and preparing signals for transmission.
Many of the functions frequently performed in such processors are performed on large numbers of data samples. Therefore, a type of processor known as a Single Instruction Multiple Data (SIMD) processor is useful, because it enables the same instruction to be performed on a whole vector of data rather than on one integer at a time. This kind of processor is able to process vector instructions, which means that a single instruction performs the same function on a limited number of data units. Data are grouped into bytes or words and packed into a vector to be operated on.
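The packing and lane-wise execution described above can be illustrated with a minimal sketch. It is a software model only, not the hardware of any particular processor. The names `LANES`, `pack`, `vector_instruction` and `simd_apply`, the vector width of four, and the zero-padding of the last vector are all assumptions made for illustration.

```python
from operator import add
from typing import Callable, List, Sequence, Tuple

LANES = 4  # hypothetical vector width; real hardware fixes this value


def pack(samples: Sequence[int], lanes: int = LANES) -> List[List[int]]:
    """Group samples into fixed-width vectors, zero-padding the last one."""
    vectors = []
    for start in range(0, len(samples), lanes):
        chunk = list(samples[start:start + lanes])
        chunk += [0] * (lanes - len(chunk))  # padding is an assumption
        vectors.append(chunk)
    return vectors


def vector_instruction(op: Callable[[int, int], int],
                       a: Sequence[int],
                       b: Sequence[int]) -> List[int]:
    """Model one vector instruction: the same op applied to every lane."""
    return [op(x, y) for x, y in zip(a, b)]


def simd_apply(op: Callable[[int, int], int],
               xs: Sequence[int],
               ys: Sequence[int],
               lanes: int = LANES) -> Tuple[List[int], int]:
    """Apply op to two sample streams; return results and instructions issued."""
    if len(xs) != len(ys):
        raise ValueError("input streams must have equal length")
    issued = 0
    out: List[int] = []
    for va, vb in zip(pack(xs, lanes), pack(ys, lanes)):
        out.extend(vector_instruction(op, va, vb))
        issued += 1  # one instruction covers a whole vector
    return out[:len(xs)], issued  # drop the padded lanes
```

Calling `simd_apply(add, xs, ys)` on six samples with four lanes issues two modeled vector instructions, where a scalar loop would issue one instruction per sample.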
As a further development of the SIMD architecture, the Single Instruction stream Multiple Tasks (SIMT) architecture has been developed. Traditionally, in the SIMT architecture, one or two vector execution units using SIMD data paths have been provided in association with an integer execution unit, which may be part of a core processor.
International Patent Application WO 2007/018467 discloses a DSP according to the SIMT architecture, having a processor core including an integer processor and a program memory, and two vector execution units which are connected to, but not integrated in, the core. The vector execution units may be Complex Arithmetic Logic Units (CALUs) or Complex Multiply-Accumulate Units (CMACs). The data to be processed in the vector execution units are provided from data memory units connected to the vector execution units through an on-chip network.
In large multi-core systems, it is difficult to affect the partitioning and to plan the resource requirements in advance. To increase flexibility, it would be useful to enable a processor to borrow resources from another digital signal processor. In the prior art, this may be done by performing a remote procedure call, which involves transferring data to a memory of the other processor and requesting execution of a function by that processor. The resulting data must then be transferred back to a memory of the first digital signal processor. This occupies a considerable amount of control capacity in the second digital signal processor, which is inefficient.
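The remote procedure call sequence described above can be sketched as follows. This is a simplified model under stated assumptions, not an implementation from the prior art. The `Dsp` class, its dictionary-based `memory`, the `control_cycles` counter and the `remote_call` helper are hypothetical names introduced only to make the three steps (transfer in, remote execution, transfer back) visible.

```python
from typing import Callable, Dict, List


class Dsp:
    """Hypothetical model of a DSP with its own private data memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.memory: Dict[str, List[int]] = {}
        self.control_cycles = 0  # crude stand-in for control capacity used

    def execute(self, func: Callable[[List[int]], List[int]],
                in_key: str, out_key: str) -> None:
        """Run func on local data; this costs control capacity on this DSP."""
        self.control_cycles += 1
        self.memory[out_key] = func(self.memory[in_key])


def remote_call(caller: Dsp, callee: Dsp,
                func: Callable[[List[int]], List[int]],
                in_key: str, out_key: str) -> int:
    """Borrow the callee's resources; return the number of data transfers."""
    # Step 1: copy the input data into the other processor's memory.
    callee.memory[in_key] = list(caller.memory[in_key])
    # Step 2: request execution of the function by the other processor.
    callee.execute(func, in_key, out_key)
    # Step 3: copy the result back into the first processor's memory.
    caller.memory[out_key] = list(callee.memory[out_key])
    return 2  # one transfer in each direction
```

In this model every call costs two memory transfers and increments `control_cycles` on the second processor, which corresponds to the overhead the paragraph above identifies.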
An alternative solution, which is common in digital signal processors, is to let a number of processors share one memory that can be accessed by all of them. The memory may be a data memory, a program memory or a combined data and program memory. Memories that can be accessed from several processors are expensive and difficult to handle in terms of cache arbitration. They become unpredictable and difficult to synchronize.