1. Field of the Invention
This invention relates to the field of processing systems suitable for multimedia processing. More specifically, this invention relates to a system having a programmable, modular, accelerated multimedia processor.
2. Description of the Related Art
Digital multimedia systems require a substantial digital signal processing capability. This requirement is shared by many other digital systems including image rendering systems, artificial vision systems, digital communication systems, and speech recognition systems. The typical architecture for such systems is shown in FIG. 1.
FIG. 1 shows a microcontroller bus 102 which couples a microcontroller 104 to a microcontroller memory 106. A digital signal processor (DSP) 108 is similarly coupled to a DSP memory 110 by a DSP bus 112. The two busses are coupled by a bus bridge 114.
This architecture is popular since the microcontroller 104 can assume the responsibility for system-level functions (such as controlling a user interface, initiating and terminating operation of various system modules, and coordinating data transfers), and the DSP 108 can assume the responsibility for computationally-intensive tasks (such as various coding and compression algorithms, filtering operations, and data transforms). This division of labor eases system design and programming.
However, this architecture is inadequate for future generations of digital multimedia systems. Processing requirements are increasing as designers take advantage of compression algorithms and higher bandwidths to transmit more information. For example, new premium services have been proposed for "Third Generation (3G) Wireless" applications. Third generation wireless refers to international standards defined by the Universal Mobile Telecommunications System (UMTS) committee and the International Mobile Telecommunications in the year 2000 (IMT-2000) group. Third generation wireless applications support bit rates from 384 kb/s to 2 Mb/s, allowing designers to provide wireless systems with multimedia capabilities, superior quality, reduced interference, and a wider coverage area.
To a small extent, the processing capacity of a DSP can be increased through the development of new algorithms and careful optimization of software. However, this requires a substantial investment of time and resources for an indeterminate payoff. A more pragmatic solution is to use a more powerful DSP.
A more powerful DSP can be created in two ways. The clock speed can be increased, but this requires careful optimization and redesign of the DSP for every incremental improvement in semiconductor processing technology. Alternatively, the DSP can be provided with wider data paths, e.g., an 8-bit DSP could be replaced with a 32-bit DSP. However, the required area and power consumption grow quadratically with the data path width (i.e., doubling the data path width increases the area and power requirements by approximately a factor of four). This alternative is undesirable since power consumption is a perennial design constraint, particularly in view of the increasing popularity of portable devices.
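By way of illustration only, the approximate quadratic relationship between data path width and area or power may be sketched as follows. The function name relative_cost and the pure square-law model are assumptions made for this example; actual scaling depends on the particular DSP design.

```python
def relative_cost(old_width_bits: int, new_width_bits: int) -> float:
    """Approximate factor by which area and power grow when the data path
    is widened, assuming a pure square-law (quadratic) model."""
    if old_width_bits <= 0 or new_width_bits <= 0:
        raise ValueError("data path widths must be positive")
    width_ratio = new_width_bits / old_width_bits
    return width_ratio ** 2


if __name__ == "__main__":
    # Doubling the width (8 -> 16 bits) costs roughly four times as much.
    print(relative_cost(8, 16))
    # Quadrupling the width (8 -> 32 bits) costs roughly sixteen times as much.
    print(relative_cost(8, 32))
```

Under this assumed model, replacing an 8-bit DSP with a 32-bit DSP would increase the area and power requirements by approximately a factor of sixteen.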
Furthermore, larger data path widths are likely to be a poor "fit" for the data granularity, leading to inefficient use of the more powerful DSPs. For example, MPEG video compression operates with 8-bit blocks of video data. Even if multiple blocks were retrieved at a time, the DSP could only perform (at most) one 8-bit block operation per clock cycle. The rest of the data path width is unused for these operations.
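The mismatch between data granularity and data path width may be quantified with a simple calculation, offered here only as a sketch. The function name datapath_utilization is hypothetical, and the example assumes one operand is processed per clock cycle, as in the case described above.

```python
def datapath_utilization(operand_bits: int, datapath_bits: int) -> float:
    """Fraction of the data path width doing useful work when a single
    operand of operand_bits is processed per clock cycle."""
    if operand_bits <= 0 or datapath_bits <= 0:
        raise ValueError("bit widths must be positive")
    # An operand wider than the data path cannot use more than all of it.
    return min(operand_bits, datapath_bits) / datapath_bits


if __name__ == "__main__":
    # One 8-bit operation per cycle on a 32-bit data path.
    print(datapath_utilization(8, 32))
```

Under these assumptions, an 8-bit operation on a 32-bit data path uses one quarter of the available width, leaving the remaining three quarters idle for that cycle.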
To address these problems, the architecture of FIG. 1 may be modified by the addition of a dedicated, hardwired (non-programmable) accelerator that is custom-designed to carry out specific algorithms quickly and efficiently. The accelerator may be coupled to the DSP 108 and the DSP memory 110 via the DSP bus 112. The DSP 108 then performs the less demanding tasks of pre-processing and post-processing the data, and allows the accelerator to perform the processing steps that the DSP 108 is too inefficient to perform.
Most hardwired accelerators are tightly coupled to the DSP as an integral part of the system. It is often difficult to integrate such accelerators into other systems without significant modification.
Further, while fast and efficient, hardwired accelerators are optimized for specific tasks. Multiple tasks would require multiple accelerators. This represents an undesirable cost (in terms of area and power consumption) for next generation multimedia systems which may be expected to perform a wide variety of multimedia processing tasks. For example, the 3G wireless communications system may require support for echo cancellation, voice recognition, high-quality sound and graphics generation, audio and video compression, error correction coding, data encryption and scrambling, and transmit signal modulation and filtering. Support for demodulation, descrambling, decryption, decoding, and decompression would also be required for received signals. A programmable multimedia accelerator optimized for these operations would provide a desirable alternative to an expensive, high-performance DSP or a large collection of hardwired accelerators.