The present invention is directed, in general, to digital signal processors (DSPs) and, more specifically, to a method and apparatus for controlling hardware loops in a DSP.
The availability of high-speed data communications is creating greater demand for ever-faster digital signal processors (DSPs). Digital signal processors are used in mobile phones, cordless phones, wireless personal digital assistant (PDA) devices, local area network (LAN) cards, cable modems, and a host of radio frequency (RF) communication devices, including conventional and high-definition television (HDTV) sets and radio receivers. A number of different approaches have been taken to decrease instruction execution time, thereby increasing DSP throughput.
Many digital signal processors use one or more hardware loops to execute a sequence of instructions. A hardware loop provides true "zero overhead" loops of instructions in that no initialization instructions are needed for the loop and no dedicated branch instruction is needed at the end of the loop to branch back to the start of the loop. In a typical design, a digital signal processor (DSP) may implement, for example, three fully nested hardware loops. The DSP hardware loop architecture may comprise:
(a) three loop start registers (LSR0, LSR1, LSR2);
(b) three loop end registers (LER0, LER1, LER2);
(c) three loop count registers (LCR0, LCR1, LCR2); and
(d) three loop counter reload registers (RLD0, RLD1, RLD2).
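The register set listed above can be illustrated with a small behavioral model. The following is a minimal sketch only, not the circuit of any particular DSP. It assumes one instruction per address, that each loop count register holds the number of remaining iterations, and that the count is re-armed from the reload register when the loop finishes. The names HardwareLoop, next_pc, and trace are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HardwareLoop:
    start: int   # models a loop start register (LSRn)
    end: int     # models a loop end register (LERn)
    count: int   # models a loop count register (LCRn); assumed: iterations remaining
    reload: int  # models a loop counter reload register (RLDn)


def next_pc(pc: int, loops: List[HardwareLoop]) -> int:
    """Return the address fetched after `pc`, with no branch instruction in the loop body.

    Assumption: `loops` is ordered innermost first, one instruction per address.
    """
    for loop in loops:
        if pc == loop.end:
            if loop.count > 1:
                loop.count -= 1
                return loop.start        # branch back to the loop start at no cost
            loop.count = loop.reload     # loop finished: re-arm the counter (assumed)
    return pc + 1                        # ordinary sequential fetch


def trace(start_pc: int, stop_pc: int, loops: List[HardwareLoop]) -> List[int]:
    """Collect the sequence of fetched addresses from start_pc up to stop_pc."""
    pc, fetched = start_pc, []
    while pc < stop_pc:
        fetched.append(pc)
        pc = next_pc(pc, loops)
    return fetched


# Two nested loops: the inner loop covers addresses 2-3, the outer loop 1-4, two passes each.
loops = [HardwareLoop(start=2, end=3, count=2, reload=2),
         HardwareLoop(start=1, end=4, count=2, reload=2)]
fetched = trace(0, 6, loops)
```

In this model the inner loop body runs twice on each pass of the outer loop, and the reload step is what allows the inner loop to run again on the second outer pass without any re-initialization instruction.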
In addition to the hardware loop architecture, a DSP may comprise a sixteen (16) instruction loop buffer capable of holding short loops. The DSP uses the loop buffer to feed instructions to the decode stage and to avoid memory fetches, as long as the loop is fully contained in the loop buffer.
A conventional digital signal processor that implements a hardware loop for executing nested loops of instructions is described in U.S. Pat. No. 5,710,913 to Gupto et al. The teachings of U.S. Pat. No. 5,710,913 are hereby incorporated by reference into the present disclosure as if fully set forth herein.
Unfortunately, the circuitry used for determining when to fill and when to evict the loop buffer is quite complex. Adding to this complexity is a DSP architecture capable of executing multiple instruction sets of varying instruction sizes with minimal switching overhead, since hardware loops can be formed with any of the supported instruction sets. This complexity increases the size of the integrated circuit (IC) and the overall power consumption of the DSP. Also, the DSP may use six (6) address comparators to compare the fetch address against the three loop start registers and the three loop end registers. These comparators further increase IC size and power consumption.
Therefore, there is a need in the art for improved digital signal processors that use more efficient hardware instruction loops. In particular, there is a need for DSP hardware loop architectures that minimize the amount of circuit space and energy used by the hardware loop(s). More particularly, there is a need for DSP hardware loop architecture that reduces the complexity of the loop buffer management circuitry and that reduces the number of comparator circuits required to compare the fetch address to the loop start registers and the loop end registers.
To address the above-discussed deficiencies of the prior art, it is a primary object of the present invention to provide, for use in a digital signal processor comprising an instruction fetch stage, a decode stage, a dispatch stage, and an execute stage, an apparatus for dynamically sizing a hardware loop capable of executing a plurality of instruction sequences forming a plurality of instruction loops. According to an advantageous embodiment of the present invention, the apparatus comprises: 1) N pairs of loop start registers and loop end registers, each loop start register capable of storing a loop start address and each loop end register capable of storing a loop end address; 2) N comparators, each of the N comparators associated with one of the N pairs of loop start registers and loop end registers, wherein each of the N comparators is capable of comparing a selected one of a first loop start address and a first loop end address to a fetch program counter value to detect one of a loop start hit and a loop end hit; and 3) fetch address generation circuitry capable of detecting the loop start hit and the loop end hit and fetching from an address in a program memory an instruction associated with one of the loop start hit and the loop end hit and loading the fetched instruction into the hardware loop.
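The comparator arrangement described above, in which each loop start/loop end register pair shares a single comparator, can be sketched in software as follows. This is an illustrative model under stated assumptions, not the disclosed circuit. The source says only that each comparator compares "a selected one" of the two addresses with the fetch program counter. The selection rule used here (compare against the start address while outside the loop, and against the end address while inside it) and the names LoopComparator, in_loop, and detect_hits are hypothetical.

```python
from typing import List, Optional


class LoopComparator:
    """One comparator shared by a loop start register and a loop end register."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end
        self.in_loop = False  # hypothetical select bit: which address is compared next

    def compare(self, fetch_pc: int) -> Optional[str]:
        # Only one equality comparison is performed per fetch for this register pair.
        selected = self.end if self.in_loop else self.start
        if fetch_pc != selected:
            return None
        if self.in_loop:
            self.in_loop = False   # next comparison targets the start address again
            return "end_hit"
        self.in_loop = True        # next comparison targets the end address
        return "start_hit"


def detect_hits(comparators: List[LoopComparator], fetch_pc: int) -> List[Optional[str]]:
    """Present one fetch program counter value to all N comparators."""
    return [c.compare(fetch_pc) for c in comparators]


# N = 3 register pairs need three comparators here, rather than six.
comparators = [LoopComparator(0x10, 0x14),
               LoopComparator(0x20, 0x28),
               LoopComparator(0x30, 0x3C)]
hits = [detect_hits(comparators, pc)[0] for pc in (0x0F, 0x10, 0x11, 0x14, 0x15)]
```

Under this selection rule a loop whose start and end addresses are equal would not produce an end hit, so a real design would need to treat single-instruction loops separately. The sketch leaves that case out.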
According to one embodiment of the present invention, the N pairs of loop start registers and loop end registers comprise three pairs of loop start registers and loop end registers.
According to another embodiment of the present invention, the three pairs of loop start registers and loop end registers comprise a first loop start register and a first loop end register associated with a first instruction loop, a second loop start register and a second loop end register associated with a second instruction loop, and a third loop start register and a third loop end register associated with a third instruction loop.
According to still another embodiment of the present invention, the apparatus for dynamically sizing a hardware loop further comprises a loop buffer having a loop buffer size M capable of storing M instructions.
According to yet another embodiment of the present invention, the apparatus for dynamically sizing a hardware loop further comprises a loop buffer comparator capable of comparing the loop buffer size M to a difference between a first loop start address and a first loop end address to determine if a selected instruction loop associated with the first loop start address and the first loop end address is capable of fitting in the loop buffer.
According to a further embodiment of the present invention, the loop buffer comparator performs the comparison whenever one of the loop start registers and the loop end registers is updated.
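The loop buffer comparison described in the preceding embodiments can be sketched as follows. This is a minimal model under stated assumptions: addresses count instructions (one instruction per address), the loop end address is inclusive so that a loop holds end - start + 1 instructions, and M defaults to the sixteen-instruction buffer mentioned in the background. The names loop_fits and LoopRegisterPair are hypothetical.

```python
LOOP_BUFFER_SIZE = 16  # M; sixteen instructions, as in the background example


def loop_fits(start: int, end: int, m: int = LOOP_BUFFER_SIZE) -> bool:
    """Return True if the loop [start, end] can be held in a buffer of m instructions.

    Assumption: `end` is the address of the last instruction of the loop, so the
    loop length is end - start + 1 and the loop fits when end - start < m.
    """
    return 0 <= end - start < m


class LoopRegisterPair:
    """A loop start/loop end register pair that re-runs the comparison on every update."""

    def __init__(self, start: int, end: int, m: int = LOOP_BUFFER_SIZE) -> None:
        self.m = m
        self.start = start
        self.end = end
        self.fits = loop_fits(start, end, m)

    def write_start(self, addr: int) -> None:
        self.start = addr
        self.fits = loop_fits(self.start, self.end, self.m)  # compare on update

    def write_end(self, addr: int) -> None:
        self.end = addr
        self.fits = loop_fits(self.start, self.end, self.m)  # compare on update


pair = LoopRegisterPair(start=100, end=115)   # 16 instructions: fits
fits_before = pair.fits
pair.write_end(120)                           # 21 instructions: no longer fits
fits_after = pair.fits
```

Because the comparison runs only when a register is written, the result can be held as a single flag and consulted during fetch without recomputing the address difference.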
The foregoing has outlined rather broadly the features and technical advantages of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features and advantages of the invention will be described hereinafter that form the subject of the claims of the invention. Those skilled in the art should appreciate that they may readily use the conception and the specific embodiment disclosed as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form.
Before undertaking the DETAILED DESCRIPTION OF THE INVENTION below, it may be advantageous to set forth definitions of certain words and phrases used throughout this patent document: the terms "include" and "comprise," as well as derivatives thereof, mean inclusion without limitation; the term "or" is inclusive, meaning and/or; the phrases "associated with" and "associated therewith," as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, or the like; and the term "controller" means any device, system or part thereof that controls at least one operation; such a device may be implemented in hardware, firmware or software, or some combination of at least two of the same. It should be noted that the functionality associated with any particular controller may be centralized or distributed, whether locally or remotely. Definitions for certain words and phrases are provided throughout this patent document; those of ordinary skill in the art should understand that in many, if not most, instances such definitions apply to prior as well as future uses of such defined words and phrases.