This application claims priority to S.N. 99400557.7, filed in the European Patent Office on Mar. 8, 1999, and S.N. 98402455.4, filed in the European Patent Office on Oct. 6, 1998.
The present invention relates to processing engines configurable to manipulate bit fields, and in particular but not exclusively, to digital signal processors.
Many different types of processing engine are known, of which microprocessors are but one example. For example, Digital Signal Processors (DSPs) are widely used, in particular for specific applications. DSPs are typically configured to optimise the performance of the applications concerned and to achieve this they employ more specialised execution units and instruction sets.
A particular application of DSPs is in telecommunications equipment. Modern telecommunications systems require a high level of processing of the data transmitted over the system, for example for channel coding and de-coding, interleaving and de-interleaving or error checking or correction techniques such as cyclic redundancy checking. Such processing operates on bit fields within the data, which are expanded into or extracted from data packets transmitted over the telecommunication system and provide for data to be split and reconstructed in a manner suitable for improving or enhancing the robustness or integrity of the transmission channel. Another application is in pure signal processing, where it is frequently necessary to normalize a table of numbers representing a sampled signal, in order to maintain a good level of accuracy throughout the signal processing task. Such normalization may occur during floating point operations. Typically a normalized number, such as the mantissa, has its sign bit in the most significant bit (msb) position and the complement of the sign bit on the next lower bit position (in the form of S, not (S), X, Y, . . . Z). A mantissa which is not normalized has more than one sign bit from its most to its least significant bit (in the form of S,S, . . . , S, not (S), X, Y, . . . Z). Normalization typically comprises finding the bit location of the "S, not (S)" sequence within the number (typically called exponent computation) and shifting the "S, not (S)" sequence to the most significant bits. Often, the msb for the mantissa value is not the same as the msb for the data word comprising the mantissa. For example, for a 40 bit data word the msb for the mantissa value, or normalized sign position bit, may be bit 31. The remaining more significant bits serve as overflow bits during floating point arithmetic.
Thus, even for a normalized mantissa there could be a sequence of sign bits from the msb of the data word to the msb of the mantissa.
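By way of illustration only, the exponent computation and shift described above may be modelled in software as follows. This is a minimal sketch, not the circuitry of the invention. It assumes a 40 bit data word with the normalized sign position at bit 31, as in the example above. The function names `norm_shift` and `normalize`, the sign convention for the shift value (positive for a left shift, negative for a right shift), and the choice of returning a zero shift for an all-zeros or all-ones word are assumptions made for this example.

```python
def norm_shift(word, width=40, sign_pos=31):
    """Return the shift that moves the "S, not(S)" sequence to bits
    sign_pos and sign_pos - 1. Positive means shift left, negative right.
    Assumption: a word with no "not(S)" bit yields a shift of 0."""
    mask = (1 << width) - 1
    word &= mask
    sign = (word >> (width - 1)) & 1
    # Flag every bit that differs from the sign bit.
    diff = word ^ (mask if sign else 0)
    if diff == 0:
        return 0
    # Position of the most significant bit that is not a sign bit.
    not_s_pos = diff.bit_length() - 1
    return (sign_pos - 1) - not_s_pos


def normalize(word, width=40, sign_pos=31):
    """Return (normalized word, shift value) for a two's complement word."""
    mask = (1 << width) - 1
    word &= mask
    shift = norm_shift(word, width, sign_pos)
    # Interpret the word as signed so that a right shift replicates the sign.
    value = word - (1 << width) if (word >> (width - 1)) & 1 else word
    shifted = value << shift if shift >= 0 else value >> -shift
    return shifted & mask, shift
```

For example, under these assumptions the 40 bit word 0x0100000000, which has its leading non-sign bit among the overflow bits, is shifted right by two positions to give 0x0040000000.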
It is known to implement bit processing techniques either in software or hardware. Software implementations generally have a slow performance and a large code size. Hardware implementations are generally less flexible, and are physically located in different processor units, for example one in a fixed-point module and the other in a floating-point module. Additionally, bit field extract or expand functions have been implemented in a limited fashion that requires the bits to be extracted to be in a contiguous area of the data source, that is data memory. This is due to the fact that known bit field extract or expand functions utilize a shifter combined with mask logic. Additionally, the shifter is limited to performing only the bit field extract or expand functions, and is unable to be used in parallel for other tasks.
The present invention is directed to improving the performance of processing engines such as, for example but not exclusively, digital signal processors.
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Combinations of features from the dependent claims may be combined with features of the independent claims as appropriate and not merely as explicitly set out in the claims.
In accordance with a first aspect of the invention, there is provided an execution unit for a processing machine comprising first circuitry adapted to derive an intermediate signal from a signal input thereto. There is also provided further circuitry for receiving the intermediate signal and operating on it, and/or for receiving and operating on a signal associated with the input signal in accordance with the intermediate signal, in order to provide a further signal derived from the input signal and/or the associated signal.
In a first embodiment in accordance with the invention, the further circuitry comprises tail-part circuitry operable with the first circuitry to form a composite bit counter, and to process the intermediate signal in order to provide a bit count of the input signal.
In accordance with a second embodiment of the invention, the further circuitry comprises circuitry operable with the first circuitry for forming composite bit extract circuitry, and for utilizing the intermediate signal to provide a bit sequence extracted from the associated signal in accordance with the input signal.
In accordance with a third embodiment of the invention, the further circuitry comprises circuitry operable with said first circuitry for forming a composite bit expand circuitry. The intermediate signal is utilized to provide a bit sequence expanded from said associated signal in accordance with said input signal.
In accordance with a fourth embodiment of the invention, the further circuitry comprises tail-part circuitry operable with the first circuitry for forming composite exponent counting circuitry, utilizing said intermediate signal to provide a shift value for normalizing the input signal.
The foregoing embodiments in accordance with the invention may be provided together in any combination of two or more embodiments. An advantage of such a combination is that there is provided a common architecture for bit processing functionality typically required in signal processing applications. In particular, bit count circuitry, bit field extract and bit field expand circuitry and exponent counting circuitry may be combined into a common architecture sharing a first or head part circuitry which produces signals utilizable by separate specific circuitry for performing separate signal processing functions. Such an architecture advantageously reduces the surface area required by the signal processing functions on integrated circuits, thereby reducing the cost of such integrated circuits, or providing opportunities for further circuits and functionality on that integrated circuit.
Additionally, the common architecture supports bit field extract and expand functions for non-contiguous bits, such as bits from non-contiguous locations in an accumulator register, for example. Additionally, the common architecture provides single cycle operation within a processor timing sequence.
The further circuitry typically comprises separate circuits for performing respective signal processing operations on the associated signal in accordance with the intermediate signal, and/or for performing respective operations on the intermediate signal to derive respective signal processor results from the input signal. Each of said separate circuits can be directed to one or more of the preferred embodiments such as bit expand or extract, bit count or exponent count circuitry.
Preferably the first circuitry is adapted to derive the intermediate signal such that it is optimally configured for utilization by the further circuitry. More preferably the first circuitry is adapted to operate on segments of the input signal, said segments being sized to provide the optimally configured intermediate signal. Such optimal configuration may be directed towards reducing the number of processing elements or gates required for implementing the first and further circuitry, thereby reducing the surface area required by the common architecture.
The first circuitry and separate ones of the further circuitry provide respective composite signal processing circuitry for respective signal operations, and such respective composite signal processing circuitry may be formed non-concurrently with other of said respective composite signal processing circuitry. Thus, the common architecture may be configured either to operate the composite signal processing circuits in parallel, thereby providing enhanced speed of operation, or to operate one composite signal processing circuit at a time.
Typically, the first circuitry is operable to determine the number of occurrences of a data attribute in respective segments of the input signal, wherein the input signal generally comprises a sequence of binary digits (bits). Suitably, the respective segments comprise groups of bits. The data attribute for an input signal comprising a sequence of binary digits is one of "set" or "not set".
It is advantageous to provide first circuitry performing "bit" counting, since signal processing functions generally require knowledge of a number of bits in an input signal or segments of input signals in order to perform the function. Thus, such bit counting circuitry provides a suitable head part or first circuitry for a common architecture for signal processing functions.
Generally a first stage of the first circuitry comprises digital counters respectively associated with respective segments of the input signal, and providing an output which is representative of the number of occurrences of a data attribute in an associated segment.
The respective segments generally comprise groups of four bits, and the digital counters comprise four inputs. In a preferred embodiment, the digital counters comprise "four into two" counters.
A suitable configuration for the first stage is that of the first stage of a reduction tree for obtaining the bit count for the input signal, and a second stage of the first circuitry is suitably configured as a second stage of a reduction tree for obtaining a bit count for the input signal. In a preferred embodiment, the second stage provides the intermediate signal.
Generally, the second stage outputs a signal which is representative of a bit count for two adjacent segments of the input signal.
For the first embodiment in accordance with the invention, the tail part circuitry is configurable as a reduction tree for receiving the intermediate signal from the second stage and for providing a bit count of the input signal. Advantageously, the reduction tree is a "Wallace-like" reduction tree, which is a well known configuration. It is preferable that the intermediate signal comprises a classically encoded bit count for the adjacent respective segments.
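The staged bit count described above may be modelled behaviourally as follows. This is a sketch under stated assumptions rather than a description of the actual gates: the first stage counts set bits in each group of four bits, the second stage adds the counts of adjacent groups to give the intermediate signal, and the tail part sums the intermediate values pairwise. A plain pairwise adder tree stands in for the "Wallace-like" reduction tree, and the function names and the default 16 bit width are hypothetical.

```python
def group_counts(word, width=16, group=4):
    """First stage: number of set bits in each group, lowest group first."""
    group_mask = (1 << group) - 1
    return [bin((word >> base) & group_mask).count("1")
            for base in range(0, width, group)]


def pair_sums(values):
    """Add adjacent values pairwise; an odd trailing value is passed through."""
    padded = values + [0] if len(values) % 2 else values
    return [padded[i] + padded[i + 1] for i in range(0, len(padded), 2)]


def tree_sum(values):
    """Tail part: reduce the values pairwise until a single total remains."""
    while len(values) > 1:
        values = pair_sums(values)
    return values[0]


def bit_count(word, width=16):
    """Bit count of the input word via the staged reduction."""
    stage1 = group_counts(word, width)   # one count per four bit segment
    intermediate = pair_sums(stage1)     # one count per two adjacent segments
    return tree_sum(intermediate)
```

In this model the list returned by `pair_sums(stage1)` plays the role of the intermediate signal, which other tail circuits could reuse without repeating the first two stages.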
In accordance with the second embodiment of the invention, the further circuitry comprises means for slicing the associated signal and input signal into corresponding groups and shifting a bit positioned in an associated signal group to a position determined by a corresponding input signal group. The further circuitry further comprises means for receiving a signal corresponding to a signal from the slicing means comprising the shifted bit, and operating on that signal in accordance with control signals derived from the intermediate signal. The means for receiving the signal is adapted to operate on a signal in larger groups than the means for slicing, and to shift a bit positioned in that signal to a position determined by the control signals. Thus, the subsequent stages of the further circuitry operate on groups of bits corresponding to larger segments or groups of the input signal.
Preferably, the further circuitry comprises a further means for receiving, which is adapted to operate on a signal from the means for receiving in a yet larger group, and to shift a bit positioned in said signal from said means for receiving to a position determined by further control signals derived from the intermediate signal.
Advantageously, the slicing means comprises an encoder, and the means for receiving and further means for receiving comprise multiplexors. The input signal comprises a bit mask for identifying the bits to be extracted from the associated signal.
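The bit extract operation may be illustrated by the following software model, offered as a sketch only. It assumes that the extracted bits are packed toward the least significant end of the result, and that groups of four bits are used at the slicing stage. Bits are first packed within each group according to the corresponding mask group, and each packed group is then placed at an offset equal to the number of mask bits set in the lower groups, which corresponds to the role of the bit counts carried by the intermediate signal. The function names and the 16 bit default width are hypothetical, and the loop over groups stands in for the multiplexor stages.

```python
def extract_group(data4, mask4):
    """Pack the bits of a four bit data group selected by a four bit mask
    group toward bit 0. Returns (packed bits, number of bits packed)."""
    packed = 0
    count = 0
    for i in range(4):
        if (mask4 >> i) & 1:
            packed |= ((data4 >> i) & 1) << count
            count += 1
    return packed, count


def bit_extract(data, mask, width=16):
    """Gather the bits of data selected by mask into a contiguous field."""
    result = 0
    offset = 0  # running count of mask bits set in the lower groups
    for base in range(0, width, 4):
        packed, count = extract_group((data >> base) & 0xF, (mask >> base) & 0xF)
        result |= packed << offset
        offset += count
    return result
```

Because the mask is examined group by group, the selected bits need not be contiguous: with the mask 0x0F0F, for instance, the two selected four bit fields of 0xABCD are gathered into 0xBD.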
In the third embodiment of the invention, the further circuitry comprises means for distributing bits from the associated signal to bit locations in at least two bit groups determined by control signals derived from the intermediate signal. The further circuitry also comprises means for distributing bits of the distributed bits to bit locations in smaller bit groups determined by further control signals derived from the intermediate signal. Finally, the further circuitry comprises a further means for distributing bits of a signal corresponding to distributed bits from the means for distributing to bit locations in yet smaller groups determined by control signals derived from the input signal or mask.
Suitably, the means for distributing bits from the associated signal and the means for distributing bits of the distributed bits comprise multiplexors, and the further means for distributing comprises an encoder.
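The bit expand operation is the converse distribution and may be modelled in the same hedged manner. The sketch below assumes that the bits to be expanded occupy the least significant end of the associated signal and are distributed, in order, to the positions of the set bits of the mask. For each group of four mask bits, the source bits are taken starting at an offset equal to the number of mask bits set in the lower groups. The function names and the 16 bit default width are hypothetical.

```python
def expand_group(bits, mask4):
    """Distribute the low-order bits of `bits` to the set positions of a
    four bit mask group. Returns (placed bits, number of bits consumed)."""
    placed = 0
    count = 0
    for i in range(4):
        if (mask4 >> i) & 1:
            placed |= ((bits >> count) & 1) << i
            count += 1
    return placed, count


def bit_expand(data, mask, width=16):
    """Scatter the low-order bits of data to the positions set in mask."""
    result = 0
    offset = 0  # running count of mask bits set in the lower groups
    for base in range(0, width, 4):
        placed, count = expand_group(data >> offset, (mask >> base) & 0xF)
        result |= placed << base
        offset += count
    return result
```

Under these assumptions, expanding 0xBD with the mask 0x0F0F yields 0x0B0D, with the bit positions not selected by the mask left cleared.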
In the fourth embodiment in accordance with the invention, the further circuitry comprises tail-part circuitry comprising respective processing means for receiving respective segments of the input signal and controllable by signals derivable from respective parts of the intermediate signal to provide the shift value for normalizing the input signal.
Embodiments in accordance with the invention help reduce the surface area necessary for providing circuitry for performing signal processing functions. Thus, further functionality may be incorporated on a processor, digital signal processor or integrated circuit incorporating embodiments of the invention. This makes embodiments of the invention particularly suitable for portable apparatus, such as wireless communication devices, in which maximum functionality in minimum surface area is desirable.
A particularly advantageous utilization of embodiments in accordance with the invention is in a wireless communications device comprising a user interface including a display, a keypad or keyboard for inputting data to the communication device, a transceiver and an antenna operably coupleable to the transceiver, and which further comprises an execution unit as described above, or a processor, a digital signal processor or integrated circuit comprising such an execution unit.