This invention relates to data processing devices, electronic processing and control systems and methods of their manufacture and operation.
Generally, a microprocessor is a circuit that combines the instruction-handling, arithmetic, and logical operations of a computer on a single semiconductor integrated circuit. Microprocessors can be grouped into two general classes, namely general-purpose microprocessors and special-purpose microprocessors. General-purpose microprocessors are designed to be programmable by the user to perform any of a wide range of tasks, and are therefore often used as the central processing unit (CPU) in equipment such as personal computers. Special-purpose microprocessors, in contrast, are designed to provide performance improvement for specific predetermined arithmetic and logical functions for which the user intends to use the microprocessor. By knowing the primary function of the microprocessor, the designer can structure the microprocessor architecture in such a manner that the performance of the specific function by the special-purpose microprocessor greatly exceeds the performance of the same function by a general-purpose microprocessor regardless of the program implemented by the user.
One such function that can be performed by a special-purpose microprocessor at a greatly improved rate is digital signal processing. Digital signal processing generally involves the representation, transmission, and manipulation of signals, using numerical techniques and a type of special-purpose microprocessor known as a digital signal processor (DSP). Digital signal processing typically requires the manipulation of large volumes of data, and a digital signal processor is optimized to efficiently perform the intensive computation and memory access operations associated with this data manipulation. For example, computations for performing Fast Fourier Transforms (FFTs) and for implementing digital filters consist to a large degree of repetitive operations such as multiply-and-add and multiple-bit-shift. DSPs can be specifically adapted for these repetitive functions, and provide a substantial performance improvement over general-purpose microprocessors in, for example, real-time applications such as image and speech processing.
DSPs are central to the operation of many of today's electronic products, such as high-speed modems, high-density disk drives, digital cellular phones, complex automotive systems, and video-conferencing equipment. DSPs will enable a wide variety of other digital systems in the future, such as video-phones, network processing, natural speech interfaces, and ultra-high speed modems. The demands placed upon DSPs in these and other applications continue to grow as consumers seek increased performance from their digital products, and as the convergence of the communications, computer and consumer industries creates completely new digital products.
Designers have succeeded in increasing the performance of DSPs, and microprocessors in general, by increasing clock speeds, by removing data processing bottlenecks in circuit architecture, by incorporating multiple execution units on a single processor circuit, and by developing optimizing compilers that schedule operations to be executed by the processor in an efficient manner. The increasing demands of technology and the marketplace make desirable even further structural and process improvements in processing devices, application systems and methods of operation and manufacture.
In accordance with a preferred embodiment of the invention, there is disclosed a data processing apparatus which quickly and efficiently produces a diagonally mirrored image of an array or block of data. The apparatus comprises a first input operand consisting of a first half of an N×N bit data block and a second input operand consisting of a second half of an N×N bit data block. A first hardware bit transformation stores an upper half of an N-way bit deal of the first and second operands, and a second hardware bit transformation stores a lower half of the N-way bit deal. The upper and lower halves of the N-way bit deal represent a diagonally mirrored image of the N×N bit data block.
In a further embodiment, the first input operand is read from a first input register, the second input operand is read from a second input register, the upper half of the N-way bit deal is stored in a first destination register, and the lower half of the N-way bit deal is stored in a second destination register.
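The N-way bit deal described above can be modeled in software to see why its two halves form the mirrored block. The following Python sketch is illustrative only and is not the disclosed hardware: the function name `n_way_bit_deal`, the most-significant-bit-first ordering, and the choice to treat the first operand as the upper half of the concatenated word are all assumptions made for the example.

```python
from typing import Tuple


def n_way_bit_deal(first: int, second: int, n: int = 8) -> Tuple[int, int]:
    """Model an N-way bit deal of two operands holding an N x N bit block.

    Assumed conventions (not taken from the source): each operand holds
    N/2 rows of N bits, row 0 sits in the most significant bits of
    `first`, and column 0 is the most significant bit of each row.
    """
    total_bits = n * n
    half_bits = total_bits // 2

    # Concatenate the operands and list the bits, most significant first.
    word = (first << half_bits) | second
    bits = [(word >> (total_bits - 1 - p)) & 1 for p in range(total_bits)]

    # Deal the bits like cards into N piles: pile j receives bits
    # j, j + N, j + 2N, ... which is exactly column j of the block.
    piles = [bits[j::n] for j in range(n)]
    dealt = [bit for pile in piles for bit in pile]

    def to_int(bit_list):
        value = 0
        for bit in bit_list:
            value = (value << 1) | bit
        return value

    # Upper half goes to the first destination, lower half to the second.
    return to_int(dealt[:half_bits]), to_int(dealt[half_bits:])
```

Under these assumptions, a block whose top row is all ones (`first = 0xFF000000`, `second = 0`) yields `0x80808080` in both destinations, that is, a block whose leftmost column is all ones.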
In accordance with another preferred embodiment of the invention, there is disclosed a method of generating a diagonally mirrored image of an N×N bit data block. The method comprises retrieving a first N/2 N-bit rows of the data block from a memory and packing the first N/2 rows into a first input operand loaded into a first input register, and retrieving a second N/2 N-bit rows of the data block from the memory and packing the second N/2 rows into a second input operand loaded into a second input register. A first hardware bit transformation is performed storing an upper half of an N-way bit deal of the first and second input operands to a first destination register. A second hardware bit transformation is also performed storing a lower half of an N-way bit deal of the first and second input operands to a second destination register. N N-bit data segments from the first and second destination registers are unpacked and the data segments are stored to the memory, whereby the N N-bit data segments represent the diagonally mirrored image of the N×N bit data block.
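The pack, transform, and unpack sequence of this method can be sketched in Python as follows. This is a software model under stated assumptions, not the claimed implementation: memory is represented as a list of N-bit integers, the helper names (`pack_rows`, `deal_halves`, `unpack_rows`, `mirror_block`) are hypothetical, and bits are ordered most significant first.

```python
def pack_rows(rows, n):
    """Pack a list of N-bit rows into one operand, first row most significant."""
    operand = 0
    for row in rows:
        operand = (operand << n) | (row & ((1 << n) - 1))
    return operand


def deal_halves(first, second, n):
    """Software stand-in for the two hardware bit transformations.

    Returns (upper half, lower half) of the N-way bit deal. The bit at
    position p (counted from the most significant end) moves to position
    (p mod N) * N + (p div N).
    """
    total = n * n
    half = total // 2
    word = (first << half) | second
    dealt = 0
    for p in range(total):
        bit = (word >> (total - 1 - p)) & 1
        q = (p % n) * n + p // n
        dealt |= bit << (total - 1 - q)
    return dealt >> half, dealt & ((1 << half) - 1)


def unpack_rows(operand, count, n):
    """Split an operand back into `count` N-bit segments, most significant first."""
    mask = (1 << n) - 1
    return [(operand >> (n * (count - 1 - i))) & mask for i in range(count)]


def mirror_block(memory, n=8):
    """Return the diagonally mirrored image of the N x N block in `memory`."""
    half = n // 2
    first = pack_rows(memory[:half], n)    # first N/2 rows -> first input operand
    second = pack_rows(memory[half:n], n)  # second N/2 rows -> second input operand
    upper, lower = deal_halves(first, second, n)
    return unpack_rows(upper, half, n) + unpack_rows(lower, half, n)
```

As a usage example under the same assumptions, `mirror_block([0xF0, 0, 0, 0, 0, 0, 0, 0])` returns `[0x80, 0x80, 0x80, 0x80, 0, 0, 0, 0]`: the four set bits in the top row become four set bits down the leftmost column.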
In accordance with another preferred embodiment of the invention, there is disclosed a method of generating a diagonally mirrored image of an M×M bit data block. The method comprises dividing the M×M bit data block into Y N×N bit data blocks, wherein M=N×Z, Z is an integer greater than one, and Y=Z². The method further comprises generating minor diagonally mirrored images of each of the N×N bit data blocks. Each minor transformation comprises retrieving a first N/2 N-bit rows of the N×N data block from a memory and packing the first N/2 rows into a first input operand loaded into a first input register, retrieving a second N/2 N-bit rows of the N×N data block from the memory and packing the second N/2 rows into a second input operand loaded into a second input register, performing a first hardware bit transformation storing an upper half of an N-way bit deal of the first and second input operands to a first destination register, performing a second hardware bit transformation storing a lower half of an N-way bit deal of the first and second input operands to a second destination register, unpacking N N-bit data segments from the first and second destination registers, and storing the minor diagonally mirrored image to the memory, wherein N×N data block A and N×N data block B are swapped in memory if block A and block B are mirror image blocks of each other about a major diagonal of the M×M bit data block where a b for bit(a,b).
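The block decomposition for the larger M×M case can be illustrated with the Python sketch below. It is a model under stated assumptions rather than the disclosed method itself: the M×M block is held as a list of M integers of M bits each, `transpose_tile` is a plain software stand-in for the pair of hardware bit transformations, and the names `transpose_tile` and `mirror_large_block` are hypothetical. Writing the mirrored tile from tile position (i, j) to tile position (j, i) has the same effect as swapping blocks A and B about the major diagonal.

```python
def transpose_tile(tile_rows, n):
    """Software stand-in for the minor transformation of one N x N tile."""
    out = [0] * n
    for r, row in enumerate(tile_rows):
        for c in range(n):
            bit = (row >> (n - 1 - c)) & 1
            out[c] |= bit << (n - 1 - r)
    return out


def mirror_large_block(rows, m, n):
    """Mirror an M x M bit block about its major diagonal using N x N tiles.

    Assumes M = N * Z with Z an integer greater than one, giving Z * Z tiles.
    """
    if m % n != 0 or m // n < 2:
        raise ValueError("M must equal N * Z for an integer Z greater than one")
    z = m // n
    mask = (1 << n) - 1
    result = [0] * m
    for ti in range(z):          # tile row index
        for tj in range(z):      # tile column index
            # Extract tile (ti, tj) as N rows of N bits.
            src_shift = m - n * (tj + 1)
            tile = [(rows[ti * n + r] >> src_shift) & mask for r in range(n)]
            mirrored = transpose_tile(tile, n)
            # Store the mirrored tile at the swapped position (tj, ti).
            dst_shift = m - n * (ti + 1)
            for r in range(n):
                result[tj * n + r] |= mirrored[r] << dst_shift
    return result
```

For example, with M=16 and N=8 the block is split into four 8×8 tiles; the two tiles on the major diagonal are mirrored in place, while the two off-diagonal tiles are mirrored and exchange positions.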
An advantage of the inventive concepts is that an operation which is cumbersome and slow to perform in software is significantly sped up without adding excess complexity to the hardware design.