The basic architecture of a wide range of digital signal processors having embedded numerical processor accelerators has been labeled the ‘megastar’ architecture. FIG. 1 illustrates a block diagram of the ‘megastar’ architecture. The central processor unit (CPU) 100 is linked by busses 117 to internal memory, including ROM 102, single-access SARAM 103, and dual-access DARAM 104, by way of memory interface unit (MIF) 105. The MIF 105 also links the processor and internal memory to external memory bus 115 via DMA 107 and external memory interface 106. The peripheral interface unit (PIF) 108 provides a bus 110 to a number of external peripherals, denoted by ports 111, 112, 113 and 114.
The numerical co-processor unit 101 communicates exclusively with the main CPU 100 via bus 109. This numerical co-processor could be a co-processor designed for efficient processing of floating point data. Alternatively, it could be a special-purpose design that performs other specific functions, for example encoding or decoding of complex data. Co-processors such as block 101 are included to speed up computations and are referred to as ‘accelerators’.
The 48-bit single-cycle fetch of the conventional ‘megastar’ architecture is illustrated in FIG. 2. The CPU 201 is programmed to fetch 48 bits through memory interface unit (MIF) 203. This fetch is accomplished over three busses 220, 221 and 222, which are composite busses containing address, data and control signals. The CPU sends addresses over these three busses and receives ‘read’ data in return. Memory is organized as 32-bit wide data banks, illustrated as combined SARAM and DARAM banks in FIG. 2 by Bank-0 204 and Bank-1 205. Memory Bank-0 204 is connected to MIF 203 via bus 214. Memory Bank-1 205 is connected to MIF 203 via bus 215.
Because memory in the ‘megastar’ architecture is organized as banks of 32-bit words, the MIF 203 handles the 32-bit to 16-bit translation and the appropriate address decoding to fetch the desired 16-bit ‘half-word’ on each of the composite busses 220, 221, and 222. The MIF 203 can pass to the CPU any three half-words addressed by busses 220, 221, and 222. Two of these busses can be used to select both the upper and lower half-words of a single 32-bit word stored in the memory banks 204 and 205.
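By way of illustration only, the half-word selection performed by the MIF 203 may be sketched in software as follows. This is a minimal behavioral model and not a description of the actual hardware. The function names `read_half_word` and `mif_fetch`, the interleaving of consecutive 32-bit words across the two banks, and the convention that an even half-word address selects the upper 16 bits are all assumptions made for this example; none of them is stated in the description above.

```python
# Behavioral sketch of a MIF-style half-word fetch (illustrative assumptions only).
# Each bank is modeled as a list of 32-bit words.


def read_half_word(banks, half_addr):
    """Return the 16-bit half-word at half-word address `half_addr`.

    Assumed layout (hypothetical): consecutive 32-bit words alternate
    between banks, and an even half-word address selects the upper half.
    """
    word_addr, half_sel = divmod(half_addr, 2)  # which 32-bit word, which half
    bank_idx = word_addr % len(banks)           # assumed bank interleaving
    row = word_addr // len(banks)               # word position inside the bank
    word = banks[bank_idx][row]
    if half_sel == 0:
        return (word >> 16) & 0xFFFF            # upper half-word
    return word & 0xFFFF                        # lower half-word


def mif_fetch(banks, addr_220, addr_221, addr_222):
    """Model one cycle: three independent half-word reads, one per bus."""
    return tuple(read_half_word(banks, a) for a in (addr_220, addr_221, addr_222))


# Bank-0 and Bank-1, each holding two 32-bit words.
banks = [
    [0x11112222, 0x55556666],  # Bank-0
    [0x33334444, 0x77778888],  # Bank-1
]

# Busses 220 and 221 select both halves of one 32-bit word;
# bus 222 selects a single half of a second word.
fetched = mif_fetch(banks, 0, 1, 2)
```

In this model the first two addresses return both halves of a single 32-bit word, while the third address returns only one half of a second word, which corresponds to the 48-bit limit of a single cycle.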
In summary, the three 16-bit read busses 220, 221, and 222 allow 48 bits to be fetched in a single clock cycle. While this type of fetch is adequate for a variety of applications, it is often desirable to fetch a full 64 bits of data (two 32-bit words) in a single clock cycle to achieve maximum performance in the accelerator.
The processor configuration illustrated in FIG. 2 may alternatively fetch a 32-bit word in one cycle, for example using busses 220 and 221. In that same cycle it can fetch another 16-bit half-word that is either the upper or lower 16 bits of a second 32-bit word, for example using bus 222. A second fetch cycle is then required to obtain the remaining half-word of the second 32-bit word.
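The cycle count implied by this limitation can be checked with simple arithmetic. The sketch below is illustrative only; the function name `fetch_cycles` is hypothetical, and the model assumes that every read bus delivers exactly one 16-bit half-word per cycle with no bank conflicts or wait states.

```python
import math

BUS_WIDTH_BITS = 16   # width of each read bus (220, 221, 222)
NUM_READ_BUSSES = 3   # number of read busses available per cycle


def fetch_cycles(total_bits, bus_width_bits=BUS_WIDTH_BITS, num_busses=NUM_READ_BUSSES):
    """Return the number of cycles needed to fetch `total_bits`.

    Assumes (for illustration) that each bus carries one half-word
    per cycle and that no cycle is lost to conflicts.
    """
    half_words = math.ceil(total_bits / bus_width_bits)  # half-words required
    return math.ceil(half_words / num_busses)            # cycles at 3 half-words each


cycles_for_48 = fetch_cycles(48)  # three half-words fit in one cycle
cycles_for_64 = fetch_cycles(64)  # four half-words need a second cycle
```

Under these assumptions a 64-bit operand pair requires four half-word reads, one more than the three busses can supply, so the fetch spills into a second cycle.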