1. Field of the Invention
The present invention relates to the field of integrated circuit design, specifically to the integration of peripheral components and macro functions with a central processing unit (CPU) or user-customizable microprocessor.
2. Description of Related Technology
As semiconductor processing capabilities increase the number of transistors that can be economically built on a single Integrated Circuit (IC), system designers are increasingly hampered by the difficulty of combining large-scale macro blocks on a single IC. Such large-scale macro blocks (or “macro functions”) include, for example, those associated with third generation (“3G”) communications architectures, such as functions performing Viterbi butterfly decode, cyclic redundancy checks (CRC), convolutional encoding/decoding, permutation, and carrier modulation/demodulation. Some of the problems encountered by the designer are underscored by the need to integrate special purpose functions with an existing instruction set implemented by a given central processing unit (CPU). Often, a non-integrated design approach is employed, wherein the large-scale macro blocks or functions are treated as separate entities from the processor core. This approach adds complexity and requires specialized or unique interfaces between the core and its associated functions that are not standardized across the device. Specifically, with respect to memory interfaces, the use of control registers associated with the memory ports of the interface not only complicates the design, but may also, under certain circumstances, restrict the functionality of the interface. For example, individual macro blocks associated with the design may be precluded from acting on data in separate memory banks simultaneously, thereby hindering the performance of the design as a whole by requiring that memory accesses be performed in “lock-step” fashion.
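By way of illustration only, the cost of such “lock-step” access can be sketched with a simplified cycle count. The model below is an assumption made for this example and is not taken from any particular prior art device: each macro block is represented as a hypothetical (bank, access count) pair, every access is assumed to take one cycle, and blocks that target different banks are assumed able to proceed in parallel when simultaneous access is permitted.

```python
from collections import defaultdict


def lockstep_cycles(jobs):
    """Cycles needed when all memory accesses are serialized.

    jobs: list of (bank_id, access_count) pairs, one per macro block.
    Assumes one cycle per access (illustrative only).
    """
    return sum(count for _, count in jobs)


def concurrent_cycles(jobs):
    """Cycles needed when blocks on different banks may run simultaneously.

    Blocks that target the same bank still serialize on that bank,
    so the total is set by the most heavily loaded bank.
    """
    per_bank = defaultdict(int)
    for bank, count in jobs:
        per_bank[bank] += count
    return max(per_bank.values(), default=0)


if __name__ == "__main__":
    # Three hypothetical macro blocks, each working on its own bank.
    jobs = [(0, 64), (1, 48), (2, 16)]
    print("lock-step :", lockstep_cycles(jobs))
    print("concurrent:", concurrent_cycles(jobs))
```

Under these assumptions, the serialized case costs the sum of all accesses, whereas the simultaneous case is bounded by the busiest bank alone; no benefit arises when all blocks contend for the same bank.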
Prior art treatment of large-scale macro functions as separate entities within the design suffers from further disadvantages relating to memory. In particular, since the macro block is effectively a separate entity from the core, memory interfaces to existing core memory are often quite complex, thereby often necessitating the provision of separate memory dedicated to the macro function (or shared between multiple macro functions). The requirement for such additional memory adds cost and complexity to the device, and consumes already scarce real estate on the die. This is especially true for so-called “system-on-a-chip” (SoC) designs, where available memory is often a limiting parameter. Additionally, such dedicated “off-core” memory is by definition not local to the core, and hence results in increased latency when such memory must be accessed by the core.
Furthermore, as more such large-scale macro function blocks are added to the design, the propensity for such increased complexity and non-standardization across the design increases accordingly.
In addition, conventional interface mechanisms are typically based on a common bus, and transfers between peripherals and the core(s) are arbitrated by one or more direct memory access (DMA) controllers. Under such an approach, however, the timing of data transfers may not be deterministic, which is often a crucial requirement for DSP applications. Specifically, DSP systems often require not only that data be processed correctly in the mathematical sense, but also that results be delivered at the right time. In this sense, a “deterministic” transfer is one for which the timing is exactly known.
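The distinction between arbitrated and deterministic transfers can be illustrated with a simplified timing model. The sketch below is hypothetical and rests on stated assumptions: DMA bursts are given as sorted, non-overlapping (start, length) windows, the DMA controller always wins arbitration, a peripheral transfer that cannot complete before a burst begins is deferred until the burst ends, and a dedicated port completes every transfer in a fixed number of cycles.

```python
def shared_bus_latencies(request_times, dma_bursts, transfer_cycles=4):
    """Latency of each peripheral transfer on a bus shared with DMA bursts.

    request_times: cycle at which each transfer is requested.
    dma_bursts: sorted, non-overlapping (start, length) windows during
        which the bus is held by the DMA controller (assumed to always
        win arbitration).
    """
    latencies = []
    for requested in request_times:
        start = requested
        for burst_start, burst_len in dma_bursts:
            burst_end = burst_start + burst_len
            # Defer the transfer if it would overlap this burst.
            if start < burst_end and start + transfer_cycles > burst_start:
                start = burst_end
        latencies.append(start + transfer_cycles - requested)
    return latencies


def dedicated_port_latencies(request_times, transfer_cycles=4):
    """Latency when each transfer has an uncontended, fixed-time path."""
    return [transfer_cycles for _ in request_times]


if __name__ == "__main__":
    requests = [0, 10, 20, 30]          # periodic sample transfers
    bursts = [(8, 5), (29, 3)]          # hypothetical DMA activity
    print(shared_bus_latencies(requests, bursts))   # [4, 7, 4, 6]
    print(dedicated_port_latencies(requests))       # [4, 4, 4, 4]
```

In this model the shared-bus latency varies from request to request depending on unrelated DMA activity, whereas the dedicated path yields the same latency every time, which is the property referred to above as deterministic.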
Based on the foregoing, there is a need for an improved apparatus and method for enabling macro functions and peripherals present on an integrated device to interface with the device processor core in a simple and standardized manner. Such an improved interface would not only provide a standardized interface for macro functions across the device, but would also allow multiple macro functions to interface with individual memory banks (or a plurality thereof) simultaneously. Such improved apparatus and method would also ideally obviate the separate or discrete local memory now used in support of macro (e.g., DSP core) functions, and facilitate deterministic transfer of data between functional entities in the design.