1. Field of the Invention
This invention relates to digital signal processors and particularly to dual-threaded, asymmetric parallel processing systems which include a general purpose processor and a vector processor for manipulation of vector data.
2. Description of Related Art
A variety of digital signal processors (DSPs) are used in multimedia applications such as coding and decoding of video, audio, and communications data. One type of digital signal processor (DSP) has dedicated hardware to address a specific problem such as MPEG video decoding or encoding. Dedicated hardware DSPs generally provide high performance per cost but are only usable for specific problems and unable to adapt to other problems or changes in standards.
Programmable DSPs execute programs which solve multimedia problems and provide greater flexibility than dedicated hardware DSPs because changing software for a programmable DSP can change the problem solved. A disadvantage of programmable DSPs is their lower performance per cost. A programmable DSP typically has an architecture similar to that of a general purpose processor and a relatively low processing power. The low processing power generally results from an attempt to minimize cost. Thus, such a DSP is not wholly satisfactory because its low processing power hampers the DSP's ability to address the more complex multimedia problems such as real-time video encoding and decoding.
Since a goal for a programmable DSP is to provide high processing power to address multimedia problems at a minimum cost, one could incorporate into such a DSP parallel processing, which is one known way to increase processing power. One architecture for parallel processing is a "very long instruction word" (VLIW) DSP, which is characterized by a large number of functional units, most of which perform different, but relatively simple, tasks. A single instruction for a VLIW DSP may be 128 bytes or longer and has separate parts. The parts can be executed by separate functional units in parallel. VLIW DSPs have high computing power because a large number of functional units can operate in parallel. VLIW DSPs also have relatively low cost because each functional unit is relatively small and simple. A problem for VLIW DSPs, however, is inefficiency in handling input/output control, communication with a host computer, and other functions that do not lend themselves to parallel execution in the functional units of the VLIW DSP. Additionally, programs for VLIW DSPs differ from conventional computer programs and can be difficult to develop because of a lack of programming tools and of programmers familiar with VLIW software architectures.
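The VLIW execution model described above can be illustrated with a minimal sketch. The functional unit names, the `Slot` layout, and the register names below are hypothetical and chosen only for illustration; they do not describe any particular VLIW DSP. The sketch shows one instruction made of separate parts, each handled by a different simple functional unit, with every part reading the register state that existed before the instruction issued.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical functional units: each performs one relatively simple task.
UNITS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


@dataclass
class Slot:
    """One part of a long instruction word, destined for one functional unit."""
    unit: str
    dst: str
    src_a: str
    src_b: str


def execute_bundle(bundle: List[Slot], regs: Dict[str, int]) -> Dict[str, int]:
    """Execute all slots of one instruction as if they ran in the same cycle.

    Every slot reads the register values from before the bundle, which models
    parallel execution; results are written back together afterwards.
    """
    results = {
        slot.dst: UNITS[slot.unit](regs[slot.src_a], regs[slot.src_b])
        for slot in bundle
    }
    new_regs = dict(regs)
    new_regs.update(results)
    return new_regs


regs = {"r0": 6, "r1": 3, "r2": 0, "r3": 0, "r4": 0}
bundle = [
    Slot("add", "r2", "r0", "r1"),
    Slot("sub", "r3", "r0", "r1"),
    Slot("mul", "r4", "r0", "r1"),
]
out = execute_bundle(bundle, regs)
```

Control-oriented work such as input/output handling has no natural place in such a bundle, which is the inefficiency noted above.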
In accordance with the invention, an integrated digital signal processor is disclosed. The digital signal processor combines a general purpose processor with a vector processor, which is capable of operating in parallel with the general purpose processor. The integrated digital signal processor is able to achieve high performance with low cost since each processor performs only the tasks for which it is ideally suited. For example, the general purpose processor runs a real time operating system and performs overall system management while the vector processor is used to perform parallel calculations using data structures called "vectors". A vector is a collection of data elements typically of the same type.
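The division of labor described above may be sketched as follows. This is an illustrative software model only, not the disclosed hardware: a single worker thread stands in for the vector processor, the calling thread stands in for the general purpose processor, and the function names `vector_multiply_add` and `run_system` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def vector_multiply_add(a: List[int], b: List[int], c: List[int]) -> List[int]:
    """Model of one vector operation: the same arithmetic on every element."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("vector operands must have the same length")
    return [x * y + z for x, y, z in zip(a, b, c)]


def run_system() -> Tuple[List[int], List[str]]:
    """Control thread hands vector work to a worker and keeps managing the system."""
    events: List[str] = []
    # One worker stands in for the vector processor (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=1) as vector_processor:
        pending = vector_processor.submit(
            vector_multiply_add, [1, 2, 3, 4], [10, 10, 10, 10], [5, 5, 5, 5]
        )
        # The control side is free to do system management while the vector
        # work is in flight; here that is represented by a log entry.
        events.append("control: system management while vector work is pending")
        result = pending.result()
    events.append("control: vector result collected")
    return result, events


result, events = run_system()
```

The control thread never performs element-wise arithmetic itself, and the worker never performs scheduling, which mirrors the assignment of tasks to the processor best suited for them.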
In one embodiment, the digital signal processor also includes a cache subsystem, a first bus, and a second bus. The first bus is used for high speed devices such as a local bus interface, a DMA controller, a device controller, and a memory controller. The second bus is used for slow speed devices such as a system timer, a UART, a bit stream processor, and an interrupt controller.
The cache subsystem combines caching functions with switchboarding, or data routing, functions. The switchboard functions allow multiple communication paths between the processors and buses to operate simultaneously. Furthermore, the cache portion of the cache subsystem allows simultaneous reads and writes into the cache memory.
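The switchboard function can be pictured with a small model in which several source-to-destination paths are granted in the same cycle. The port names and the arbitration rule below (each port takes part in at most one path per cycle, earlier requests win) are assumptions made for this sketch; the text above does not specify how conflicting requests are resolved.

```python
from typing import List, Set, Tuple

# Hypothetical port names for the processors, the cache, and the two buses.
PORTS: Set[str] = {
    "general_processor",
    "vector_processor",
    "cache",
    "first_bus",
    "second_bus",
}

Path = Tuple[str, str]


def schedule_cycle(requests: List[Path]) -> Tuple[List[Path], List[Path]]:
    """Grant every requested path whose two ports are still free this cycle.

    Assumed rule: a port carries at most one path per cycle, and requests are
    considered in the order given. Paths that cannot be granted are deferred.
    """
    busy: Set[str] = set()
    granted: List[Path] = []
    deferred: List[Path] = []
    for src, dst in requests:
        if src not in PORTS or dst not in PORTS:
            raise ValueError(f"unknown port in request: {(src, dst)}")
        if src == dst:
            raise ValueError("a path must connect two different ports")
        if src in busy or dst in busy:
            deferred.append((src, dst))
        else:
            busy.update((src, dst))
            granted.append((src, dst))
    return granted, deferred


requests = [
    ("general_processor", "first_bus"),
    ("vector_processor", "cache"),
    ("vector_processor", "second_bus"),
]
granted, deferred = schedule_cycle(requests)
```

In this example the first two paths share no port and are granted together, while the third waits because the vector processor port is already in use.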