This invention relates to data processing devices, electronic processing and control systems and methods of their manufacture and operation.
Generally, a microprocessor is a circuit that combines the instruction-handling, arithmetic, and logical operations of a computer on a single semiconductor integrated circuit. Microprocessors can be grouped into two general classes, namely general-purpose microprocessors and special-purpose microprocessors. General-purpose microprocessors are designed to be programmable by the user to perform any of a wide range of tasks, and are therefore often used as the central processing unit (CPU) in equipment such as personal computers. Special-purpose microprocessors, in contrast, are designed to provide performance improvement for specific predetermined arithmetic and logical functions for which the user intends to use the microprocessor. By knowing the primary function of the microprocessor, the designer can structure the microprocessor architecture in such a manner that the performance of the specific function by the special-purpose microprocessor greatly exceeds the performance of the same function by a general-purpose microprocessor regardless of the program implemented by the user.
One such function that can be performed by a special-purpose microprocessor at a greatly improved rate is digital signal processing. Digital signal processing generally involves the representation, transmission, and manipulation of signals, using numerical techniques and a type of special-purpose microprocessor known as a digital signal processor (DSP). Digital signal processing typically requires the manipulation of large volumes of data, and a digital signal processor is optimized to efficiently perform the intensive computation and memory access operations associated with this data manipulation. For example, computations for performing Fast Fourier Transforms (FFTs) and for implementing digital filters consist to a large degree of repetitive operations such as multiply-and-add and multiple-bit-shift. DSPs can be specifically adapted for these repetitive functions, and provide a substantial performance improvement over general-purpose microprocessors in, for example, real-time applications such as image and speech processing.
DSPs are central to the operation of many of today's electronic products, such as high-speed modems, high-density disk drives, digital cellular phones, complex automotive systems, and video-conferencing equipment. DSPs will enable a wide variety of other digital systems in the future, such as video-phones, network processing, natural speech interfaces, and ultra-high speed modems. The demands placed upon DSPs in these and other applications continue to grow as consumers seek increased performance from their digital products, and as the convergence of the communications, computer and consumer industries creates completely new digital products.
Designers have succeeded in increasing the performance of DSPs, and microprocessors in general, by increasing clock speeds, by removing data processing bottlenecks in circuit architecture, by incorporating multiple execution units on a single processor circuit, and by developing optimizing compilers that schedule operations to be executed by the processor in an efficient manner. The increasing demands of technology and the marketplace make desirable even further structural and process improvements in processing devices, application systems and methods of operation and manufacture.
In accordance with a preferred embodiment of the invention, there is disclosed a data processing system which unpacks data read from memory while loading it into registers using a single processor instruction. The system comprises a memory comprising a plurality of memory locations, and a central processing unit core comprising at least one register file with a plurality of registers. The core is connected to the memory for loading data from and storing data to the memory locations. The core is responsive to a load instruction to retrieve at least one data word from the memory and parse the data word over selected parts of at least two data registers in the register file. The number of data registers is greater than the number of data words parsed into the registers. In a further embodiment, the load instruction selects sign or zero extend for the data parsed into the data registers. In another embodiment, the parse comprises unpacking the lower and higher half-words of each data word into a pair of data registers. In yet another embodiment, the parse comprises unpacking the bytes of each data word into the lower and higher half-words of each of a pair of data registers. In yet another embodiment, the data is interleaved as it is parsed into the data registers.
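The load-with-unpack behavior described above can be illustrated in software. The following C model is only a sketch under stated assumptions: it assumes 32-bit data words and registers, 16-bit half-words and 8-bit bytes, none of which are fixed by the text above. The names reg_pair, load_unpack_halfwords and load_unpack_bytes are hypothetical, and the byte-to-register assignment shown is one possible ordering among several (an interleaved ordering would be another).

```c
#include <stdint.h>

/* Hypothetical model: two 32-bit destination registers. */
typedef struct {
    uint32_t lo;  /* first destination register  */
    uint32_t hi;  /* second destination register */
} reg_pair;

/* Sign-extend the low 'bits' bits of v to 32 bits without relying on
 * implementation-defined signed conversions. */
static uint32_t sign_extend(uint32_t v, unsigned bits)
{
    uint32_t m = 1u << (bits - 1);
    v &= (m << 1) - 1u;
    return (v ^ m) - m;
}

/* One memory word -> two registers: the lower half-word goes to the
 * first register, the higher half-word to the second. The flag selects
 * sign extension (nonzero) or zero extension (zero). */
reg_pair load_unpack_halfwords(const uint32_t *mem, int sign)
{
    uint32_t word = *mem;
    uint32_t lo = word & 0xFFFFu;
    uint32_t hi = word >> 16;
    reg_pair r;
    r.lo = sign ? sign_extend(lo, 16) : lo;
    r.hi = sign ? sign_extend(hi, 16) : hi;
    return r;
}

/* One memory word -> two registers: each byte is widened to a half-word.
 * Assumed ordering: bytes 0 and 1 fill the lower and higher half-words
 * of the first register, bytes 2 and 3 those of the second. */
reg_pair load_unpack_bytes(const uint32_t *mem, int sign)
{
    uint32_t word = *mem;
    uint32_t h[4];
    for (int i = 0; i < 4; i++) {
        uint32_t b = (word >> (8 * i)) & 0xFFu;
        h[i] = sign ? (sign_extend(b, 8) & 0xFFFFu) : b;
    }
    reg_pair r;
    r.lo = h[0] | (h[1] << 16);
    r.hi = h[2] | (h[3] << 16);
    return r;
}
```

In this model a single call stands in for the single load instruction: one data word is read and two registers are written, so the number of registers exceeds the number of words parsed.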
In accordance with another preferred embodiment of the invention, there is disclosed a data processing system which efficiently packs register data while storing it to memory using a single processor instruction. The system comprises a memory comprising a plurality of memory locations, and a central processing unit core comprising at least one register file with a plurality of registers. The core is connected to the memory for loading data from and storing data to the memory locations. The core is responsive to a store instruction to concatenate data from selected parts of at least two data registers into at least one data word and save the data word to memory. The number of data registers is greater than the number of data words concatenated from the data registers. In a further embodiment, there are two data registers and the concatenate comprises packing the lower half-words of the two data registers into the lower and higher half-words of the data word. In another embodiment, there are four data registers and two data words, and the concatenate comprises packing the lower half-words of the four data registers into the lower and higher half-words of each of the two data words. In yet another embodiment, there are two data registers, and the concatenate packs the lower bytes of the lower and higher half-words of each of the two data registers into the data word. In yet another embodiment, the data is interleaved as it is concatenated into the data word.
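The store-with-pack behavior described above can likewise be modeled in software. This C sketch again assumes 32-bit data words and registers, 16-bit half-words and 8-bit bytes, which the text above does not fix. The function names are hypothetical, and the order in which register fields are placed into the word is an assumption made for illustration.

```c
#include <stdint.h>

/* Two registers -> one memory word: the lower half-word of reg_a becomes
 * the lower half-word of the word, the lower half-word of reg_b becomes
 * the higher half-word. The upper halves of both registers are dropped. */
void store_pack_halfwords(uint32_t *mem, uint32_t reg_a, uint32_t reg_b)
{
    *mem = (reg_a & 0xFFFFu) | ((reg_b & 0xFFFFu) << 16);
}

/* Four registers -> two memory words, packing lower half-words pairwise. */
void store_pack_halfwords_x2(uint32_t mem[2],
                             uint32_t r0, uint32_t r1,
                             uint32_t r2, uint32_t r3)
{
    store_pack_halfwords(&mem[0], r0, r1);
    store_pack_halfwords(&mem[1], r2, r3);
}

/* Two registers -> one memory word: the lower byte of each half-word of
 * each register is kept. Assumed ordering: reg_a supplies bytes 0 and 1,
 * reg_b supplies bytes 2 and 3. */
void store_pack_bytes(uint32_t *mem, uint32_t reg_a, uint32_t reg_b)
{
    uint32_t b0 = reg_a & 0xFFu;
    uint32_t b1 = (reg_a >> 16) & 0xFFu;
    uint32_t b2 = reg_b & 0xFFu;
    uint32_t b3 = (reg_b >> 16) & 0xFFu;
    *mem = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
}
```

Each function stands in for one store instruction: more registers are read than words are written, and the discarded upper bits are the storage that packing saves.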
An advantage of the inventive concepts is that both memory storage space and central processing unit resources can be utilized efficiently when working with packed data. A single store or load instruction can perform all of the tasks that previously required several instructions, while at the same time conserving memory space.