1. Field of the Invention
The present invention generally relates to area-efficient digital elements, particularly digital filters, and more particularly, to apparatus and methods for improving spatial efficiency and related performance in finite impulse response (FIR) filters in a high speed communication system.
2. State of the Art
Applications employing high-performance digital signal processing (DSP) techniques are becoming ubiquitous. Implementations of such high-performance digital signal processors were carried out at the expense of high power consumption and heat dissipation, large die size, and die cost. Many consumer uses, such as personal computers, mass internetworking, portable communication devices, and the like, became feasible as sub-micron VLSI techniques evolved and were incorporated into DSP processing components. As a result, very high performance DSP devices and systems can be offered in much smaller packages with substantial reductions in power requirements, heat dissipation, cost, and so forth. However, the demand for components in systems having ever-higher speed and precision needs, coupled with ever-decreasing cost requirements, is unabated. One of the means by which these needs can be met is to integrate what was once a multi-board “system” into a monolithic VLSI chip. Such integration requires many constituent computational modules to be laid out in a compact, dense, efficient chip architecture. A key factor in developing an efficient layout is minimized interconnect routing between the functional elements and modules within a processor.
Typically, as the level of device integration increases, interconnect length can become one of the most significant factors in determining VLSI system performance. For example, the propagation delay of a signal through an interconnect is dependent upon its length, partly due to the contribution of length to the inherent resistance and capacitance of the interconnect. For global interconnects spanning significant distances of an IC, the delay can be on the order of the square of the interconnect length. On the other hand, local, intra-element interconnects tend to experience propagation delays that are roughly proportional to the interconnect length. Scaling also tends to decrease the power dissipation per gate, which diminishes the ability of a gate to drive the capacitance of an interconnect. Thus, even if a local interconnect is relatively short, inessential length can be highly undesirable.
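The two scaling regimes described above can be illustrated with a short numerical sketch. It relies on common first-order approximations rather than anything stated in this disclosure: the 50% delay of an unbuffered distributed RC line is taken as roughly 0.38·R·C, and the delay of a short wire dominated by its driver resistance as roughly 0.69·R_driver·C_wire. The function names and the per-millimeter resistance, capacitance, and driver resistance values are assumptions chosen only for illustration.

```python
def distributed_rc_delay(length_mm, r_per_mm, c_per_mm):
    """Approximate 50% delay (seconds) of an unbuffered distributed RC line.

    Both the total resistance and the total capacitance grow with length,
    so the delay grows with the square of the length.
    """
    return 0.38 * (r_per_mm * length_mm) * (c_per_mm * length_mm)


def driver_limited_delay(length_mm, r_driver, c_per_mm):
    """Approximate 50% delay (seconds) when the driver resistance dominates.

    Only the wire capacitance grows with length, so the delay is
    roughly proportional to the length.
    """
    return 0.69 * r_driver * (c_per_mm * length_mm)


# Assumed, illustrative values (not taken from the source text).
R_PER_MM = 100.0      # wire resistance, ohms per mm
C_PER_MM = 0.2e-12    # wire capacitance, farads per mm
R_DRIVER = 5000.0     # effective driver resistance, ohms

if __name__ == "__main__":
    for length in (0.1, 1.0, 10.0):
        global_delay = distributed_rc_delay(length, R_PER_MM, C_PER_MM)
        local_delay = driver_limited_delay(length, R_DRIVER, C_PER_MM)
        print(f"{length:5.1f} mm  distributed: {global_delay:.3e} s  "
              f"driver-limited: {local_delay:.3e} s")
```

Under these assumptions, doubling the length quadruples the distributed-line delay but only doubles the driver-limited delay, which is the distinction drawn between global and local interconnects.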
Even if the impact of an individual, local interconnect may be deemed relatively modest, the vast numbers of local interconnects distributed throughout a highly-integrated, high-speed VLSI system can, in the aggregate, produce a significant cumulative reduction in system performance. Thus, interconnect length can be the dominating factor in determining circuit performance. Indeed, despite the benefits derived from deep sub-micron device scaling, it can be difficult to take full advantage of the higher switching speeds inherent in scaled devices when the propagation of signals throughout the IC is impaired by relatively long, indirect interconnects.
Existing approaches intended to effect short metal interconnect routes typically result in irregular structures that do not lend themselves to very compact and dense layouts. Conversely, architectural approaches intended to provide a compact, dense layout usually result in relatively long, indirect interconnect lengths, and can require additional metal layers to implement functional element interconnection.
What is needed is a digital element architecture that realizes a very compact and dense layout using functional elements having regular structure and using direct, minimal interconnects between adjacent functional elements.
The present invention satisfies the above need by providing a digital element that employs a permuted bit-order functional element, which functional element is adapted to perform a preselected function. The functional element is coupled to the input and output data paths of the digital element, and is disposed such that the bit locations of the data paths are arranged in a predetermined bit-order sequence. The functional element can be adapted to provide a permuted bit-order sequence on selected bit locations of the input data path, or the output data path, or both. The permuted bit-order sequence exhibits a predetermined bit-order ordinal discontinuity, which effects substantially straight and direct interconnects, having minimized length, between adjacent structures, thus creating a very compact and dense functional module layout. The functional element also can be adapted to provide a transposed permuted bit-order sequence on selected bit locations of the input data path, or the output data path, or both. The transposed permuted bit-order sequence exhibits a predetermined bit-order ordinal discontinuity in combination with at least a portion of the data path being transposed, relative to customary data path layouts. A functional element having a transposed permuted bit-order effects substantially straight and direct interconnects, having minimized length, between adjacent structures, thus creating a very compact and dense functional module layout.
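By way of illustration only, the effect of matching a permuted bit order across adjacent structures can be sketched numerically. The sketch assumes two abutting elements whose bit locations occupy unit-pitch slots along a shared edge, and it measures the sideways jog each wire would need. The particular sequence `PERMUTED`, the unit-pitch model, and all function names are hypothetical; the text above does not specify the actual bit-order sequence.

```python
def pin_positions(bit_order):
    """Map each bit index to its physical slot along the shared edge."""
    return {bit: slot for slot, bit in enumerate(bit_order)}


def total_jog(order_a, order_b):
    """Sum of sideways offsets (in slot pitches) needed to connect
    each bit of element A to the same bit of element B."""
    pos_a = pin_positions(order_a)
    pos_b = pin_positions(order_b)
    return sum(abs(pos_a[bit] - pos_b[bit]) for bit in pos_a)


def ordinal_discontinuities(bit_order):
    """Count adjacent slots whose bit indices are not consecutive."""
    return sum(1 for x, y in zip(bit_order, bit_order[1:]) if abs(x - y) != 1)


ORDINAL = [0, 1, 2, 3, 4, 5, 6, 7]    # customary ordinal bit order
PERMUTED = [0, 4, 1, 5, 2, 6, 3, 7]   # hypothetical permuted bit order

if __name__ == "__main__":
    # Neighbor kept in ordinal order: wires must jog sideways and cross.
    print(total_jog(PERMUTED, ORDINAL))       # 12
    # Neighbor adopts the same permuted order: wires run straight.
    print(total_jog(PERMUTED, PERMUTED))      # 0
    print(ordinal_discontinuities(PERMUTED))  # 7
```

In this toy model, an element whose internal structure naturally presents its bits in the permuted order needs no lateral routing when its neighbor adopts the same order, whereas forcing the neighbor into ordinal order requires wires that jog and cross.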
The constituent components of one embodiment of a functional element according to the present invention can include, for example, multiple primitive logic units such as an AND device, an OR device, an XOR device, a NAND device, a NOR device, a NEXOR device, an inverter, or combinations thereof, disposed such that the resultant functional element is an electronic device having multiple bit locations arranged in other than ordinal numerical order. Another embodiment of a functional element according to the present invention can include combinations of these elementary functional elements, which perform essential arithmetic, logic, and switching functions. Such embodiments of functional elements can include, without limitation, an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a storage (e.g., memory) element, a logic array, and combinations thereof.
Preferred embodiments of the present invention contemplate a permuted bit-order accumulator, as well as a permuted bit-order accumulator coupled with a multiplier, using an interconnect that is substantially direct and of minimized length. Additional preferred embodiments of the present invention contemplate a transposed permuted bit-order accumulator, as well as a permuted bit-order accumulator coupled with a multiplier, using an interconnect that is substantially direct and of minimized length. A FIR filter tap module of the present invention can include such an accumulator and multiplier, such that an area-efficient FIR filter results therefrom. Indeed, in one preferred embodiment of the invention herein, an integrated multiplier/accumulator (MAC) is provided.
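The role of a multiplier and an accumulator within a filter tap can be illustrated with a behavioral sketch of a transposed-form FIR filter, in which every tap multiplies the current input sample by its coefficient and adds the product to a registered partial sum received from the neighboring tap. This is a generic textbook model offered under stated assumptions, not the circuit of any particular embodiment; the names `mac` and `transposed_fir` are hypothetical, and bit ordering and layout are not modeled.

```python
def mac(partial_sum, sample, coeff):
    """One multiply-accumulate step: product added to the incoming sum."""
    return partial_sum + sample * coeff


def transposed_fir(samples, coeffs):
    """Behavioral model of a transposed-form FIR filter.

    delayed[k] holds the registered output of tap k+1, which feeds the
    accumulator of tap k on the next sample; the last entry stays 0.
    Assumes coeffs is non-empty.
    """
    delayed = [0] * len(coeffs)
    outputs = []
    for x in samples:
        # Every tap sees the same input sample in the same cycle.
        sums = [mac(delayed[k], x, c) for k, c in enumerate(coeffs)]
        outputs.append(sums[0])      # tap 0 produces the filter output
        delayed = sums[1:] + [0]     # register partial sums for next cycle
    return outputs


if __name__ == "__main__":
    # An impulse input reproduces the coefficients as the output sequence.
    print(transposed_fir([1, 0, 0, 0], [3, 2, 1]))  # [3, 2, 1, 0]
```

Because each tap in this form communicates only with its immediate neighbor, the structure is regular and repeatable, which is the kind of arrangement that benefits from short, direct interconnects between adjacent elements.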
Furthermore, yet other embodiments of the invention herein include functional elements having even greater complexity, so that the functional elements of the invention herein comprehend a complete hierarchy of components, devices, subsystems, and systems including, without limitation, arithmetic logic units; computational modules; processors, including without limitation, general-purpose and digital signal processors; filter tap modules; FIR filters, including a direct-transposed FIR filter; transceivers, including without limitation, gigabit Ethernet transceivers; and communications systems.