The present invention is directed, in general, to digital signal processors (DSPs) and, more specifically, to a method and apparatus for controlling vertical dependencies in a DSP.
The availability of high-speed data communications is creating greater demand for ever-faster digital signal processors (DSPs). Digital signal processors are used in mobile phones, cordless phones, wireless personal digital assistant (PDA) devices, local area network (LAN) cards, cable modems, and a host of radio frequency (RF) communication devices, including conventional and high-definition television (HDTV) sets and radio receivers. A number of different approaches have been taken to decrease instruction execution time, thereby increasing DSP throughput.
Traditionally, digital signal processors have been designed to perform optimally on vector code (or array code). Because of this optimization, the performance of a conventional DSP suffers when running scalar code. Scalar code is a special case of vector code in which the array contains only a single element. Despite this drawback, emerging trends in the DSP marketplace indicate that scalar processing will become an increasingly important requirement for digital signal processors.
Managing vertical dependencies poses particular problems in a super-scalar DSP architecture having more than one instruction pipeline. For example, in a 2-way super-scalar architecture, instructions may be issued in order, but may be executed out-of-order in different pipes and the results may be written to the register files out-of-order. However, the instructions must be retired in order. A vertical dependency occurs whenever a first-issued (or previous) instruction generates a result that is stored in a target register and then used by a second-issued (or subsequent) instruction. If the subsequent instruction is in a different pipeline (or way) than the previous instruction, it is possible that the subsequent instruction may be ready for execution before the previous instruction is completed. This may cause the subsequent instruction to read an older version of the result from the target register.
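The stale-read hazard described above can be illustrated with a minimal sketch. The register names, values, cycle numbers, and the `run` helper are hypothetical and chosen only for illustration; they are not taken from the present disclosure.

```python
def run(producer_write_cycle: int, consumer_read_cycle: int) -> int:
    """Toy model of two pipes sharing one register file (illustrative only).

    Producer (pipe A): r1 = r1 + 5, result written at producer_write_cycle.
    Consumer (pipe B): r2 = r1 * 2, r1 read at consumer_read_cycle.
    Assumption: within a cycle, the write lands before the read.
    """
    regs = {"r1": 10, "r2": 0}
    last_cycle = max(producer_write_cycle, consumer_read_cycle)
    for cycle in range(last_cycle + 1):
        if cycle == producer_write_cycle:
            regs["r1"] = regs["r1"] + 5
        if cycle == consumer_read_cycle:
            regs["r2"] = regs["r1"] * 2
    return regs["r2"]


# Consumer reads r1 before the producer has written it: stale operand.
stale = run(producer_write_cycle=3, consumer_read_cycle=1)
# Consumer is held until after the producer's write: correct operand.
fresh = run(producer_write_cycle=3, consumer_read_cycle=4)
print(stale, fresh)
```

In this toy model the only difference between the two runs is when the consumer is allowed to read its source register, which is the decision a dependency-management mechanism must make.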
Scalar performance can be improved by using elaborate structures like renaming registers, completion buffers, and the like. The problems associated with managing vertical dependencies may also be addressed through the use of re-order buffers. For example, Intel P6 processors, high-end SPARC processors, and PowerPC processors use these structures and deliver impressive scalar performance.
Digital signal processors and apparatuses for handling vertical dependencies in digital signal processors are described in greater detail in U.S. Pat. No. 5,748,934 to Lesartre et al., U.S. Pat. No. 5,442,757 to McFarland et al., U.S. Pat. No. 5,550,988 to Sarangdhar et al., U.S. Pat. No. 5,560,032 to Nguyen et al., U.S. Pat. No. 5,606,670 to Abramsom et al., U.S. Pat. No. 5,625,789 to Hesson et al., U.S. Pat. No. 5,627,983 to Popescu et al., U.S. Pat. No. 5,627,985 to Fetterman et al., U.S. Pat. No. 5,630,157 to Dwyer, U.S. Pat. No. 5,644,753 to Ebrahim et al., and U.S. Pat. No. 5,644,759 to Lucas et al. The teachings of the above-referenced patents are hereby incorporated by reference into the present disclosure as if fully set forth herein.
Unfortunately, the prior art circuits used to handle vertical dependency problems take up silicon area, increase design complexity, and consume power. Silicon area and power consumption are very important considerations in communication applications, such as mobile phones, and in peripheral applications.
Therefore, there is a need in the art for improved digital signal processors that provide improved scalar performance. In particular, there is a need in the art for improved digital signal processors that provide improved management of vertical dependencies during scalar operations. More particularly, there is a need for improved digital signal processors that are capable of efficiently handling vertical dependencies without using complex circuitry that occupies a large amount of circuit space and that consumes a large amount of power.
To address the above-discussed deficiencies of the prior art, it is a primary object of the present invention to provide, for use in a digital signal processor comprising a first instruction pipeline and a second instruction pipeline, an apparatus for managing vertical dependencies between instructions in the first and second instruction pipelines. To accomplish this, numerical identifiers (IDs) are assigned sequentially to the destination registers of instructions as the instructions are dispatched to either of the first and second pipelines. Additionally, if an instruction about to enter a pipeline contains one dependent source operand that requires a result from a register (“the dependent source register”) that is dependent on execution of a prior instruction still in one of the pipelines, the ID of the dependent source register is assigned to the dependent source operand. If an instruction about to enter a pipeline contains two dependent source operands that require results from two dependent source registers, the ID of the dependent source register that is younger (i.e., most recently sent into the pipelines) is assigned to the corresponding one of the two dependent source operands.
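The ID assignment described above can be sketched in software as follows. This is an illustrative model only, not the circuitry of the invention: the `Instr` and `Dispatcher` names, the dictionary of in-flight registers, and the unbounded integer IDs are assumptions. In particular, "younger" is modeled as "numerically larger ID," which holds only while the ID counter does not wrap around; a finite hardware ID space would need additional handling that is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instr:
    dest: Optional[str]            # destination register, if any
    srcs: List[str]                # source registers (up to two in this sketch)
    dest_id: Optional[int] = None  # ID assigned to the destination register
    dep_id: Optional[int] = None   # ID the instruction depends on, if any


class Dispatcher:
    """Hypothetical model of ID assignment at dispatch."""

    def __init__(self) -> None:
        self.next_id = 0
        # register name -> ID of the most recent in-flight writer
        self.in_flight: Dict[str, int] = {}

    def dispatch(self, instr: Instr) -> Instr:
        # Resolve source dependencies first, so an instruction that reads and
        # writes the same register depends on the earlier writer, not itself.
        dep_ids = [self.in_flight[r] for r in instr.srcs if r in self.in_flight]
        # With two dependent sources, keep only the younger (larger) ID.
        instr.dep_id = max(dep_ids) if dep_ids else None
        if instr.dest is not None:
            instr.dest_id = self.next_id
            self.in_flight[instr.dest] = self.next_id
            self.next_id += 1
        return instr


d = Dispatcher()
i0 = d.dispatch(Instr(dest="r1", srcs=["r5", "r6"]))  # no dependent sources
i1 = d.dispatch(Instr(dest="r2", srcs=["r1"]))        # one dependent source
i2 = d.dispatch(Instr(dest="r3", srcs=["r1", "r2"]))  # two dependent sources
print(i0.dep_id, i1.dep_id, i2.dep_id)
```

Keeping only the younger of two dependency IDs means each instruction carries a single ID into the scheduler, on the reasoning that in-order retirement implies the older producer has retired once the younger one has.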
At the end of the instruction pipelines, the identifiers of executed (or retired) instructions are reclaimed. The present invention tracks a sequential list of retired IDs in order to determine the next sequential ID to be retired (referred to as “next retire ID”). A dispatched instruction from either pipeline is scheduled for execution by comparing the identifier associated with the source operands in the dispatched instruction with the next retire ID. If the dispatched instruction contains only one dependent source operand, the ID of the dependent source register is compared to the next retire ID. If the dispatched instruction contains two dependent source operands, the previously determined younger ID assigned to one of the dependent source operands is compared to the next retire ID. The dispatched instruction is scheduled for execution only if the dependent source operand ID is less than or equal to the next retire ID.
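The reclaim and scheduling comparison described above might be modeled as in the following sketch. The `RetireTracker` class and its method names are hypothetical. The comparison is taken literally from the text (dependent ID less than or equal to the next retire ID); how the hardware guarantees that the result is readable when the two IDs are equal is not modeled here, and ID wraparound is again ignored.

```python
from typing import Optional, Set


class RetireTracker:
    """Hypothetical model of ID reclaim and the scheduling comparison."""

    def __init__(self) -> None:
        self.next_retire_id = 0
        # IDs reported by the pipelines but not yet part of the
        # contiguous (sequential) run of retired IDs
        self.pending: Set[int] = set()

    def report(self, instr_id: int) -> None:
        """Record an ID arriving from the end of either pipeline."""
        self.pending.add(instr_id)
        # Extend the sequential run of retired IDs as far as possible.
        while self.next_retire_id in self.pending:
            self.pending.remove(self.next_retire_id)
            self.next_retire_id += 1

    def can_schedule(self, dep_id: Optional[int]) -> bool:
        """Comparison as stated in the text: dep ID <= next retire ID."""
        if dep_id is None:  # no dependent source operand
            return True
        return dep_id <= self.next_retire_id


t = RetireTracker()
t.report(1)                    # ID 1 arrives first, out of order
after_first = t.next_retire_id
t.report(0)                    # ID 0 arrives; IDs 0 and 1 are now sequential
after_second = t.next_retire_id
print(after_first, after_second, t.can_schedule(2), t.can_schedule(3))
```

Because only one counter is compared against one ID per instruction, the check reduces to a single magnitude comparison rather than a search through a buffer of in-flight results.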
In an exemplary embodiment, the first and second instruction pipelines comprise an instruction fetch stage, a decode stage, a dispatch stage, a schedule stage, an execution stage, and a retire stage. According to an advantageous embodiment of the present invention, the apparatus for managing vertical dependencies between instructions in the first and second instruction pipelines comprises: 1) identifier (ID) reclaim circuitry capable of determining a sequential set of retired identifiers associated with retired instructions executed by the first and second instruction pipelines, wherein the ID reclaim circuitry is further capable of determining a next retire ID sequentially following the sequential set of retired identifiers; 2) first ID generation circuitry capable of sequentially assigning identifiers to destination registers associated with instructions entering the first and second instruction pipelines; 3) second ID generation circuitry associated with the first instruction pipeline capable of identifying a first dependent source register associated with a first dependent source operand of a first instruction entering the first instruction pipeline, and assigning an ID of the first dependent source register to the first dependent source operand; and 4) instruction scheduling circuitry capable of comparing the first dependent source operand ID of the first instruction with the next retire ID and scheduling the first instruction for execution if the first dependent source operand ID is one of: 1) less than the next retire ID and 2) equal to the next retire ID.
According to one embodiment of the present invention, the second ID generation circuitry is further capable of identifying a second dependent source register associated with a second dependent source operand of the first instruction, comparing an ID of the second dependent source register with the first dependent source register ID to determine a first younger ID and assigning the first younger ID to a corresponding one of the first dependent source operand and the second dependent source operand.
According to another embodiment of the present invention, the instruction scheduling circuitry is further capable of comparing the first younger ID associated with the corresponding one of the first and second dependent source operands with the next retire ID and scheduling the first instruction for execution if the first younger ID is one of: 1) less than the next retire ID and 2) equal to the next retire ID.
According to still another embodiment of the present invention, the apparatus for managing vertical dependencies further comprises third ID generation circuitry associated with the second instruction pipeline capable of identifying a third dependent source register associated with a third dependent source operand of a second instruction entering the second instruction pipeline, and assigning an ID of the third dependent source register to the third dependent source operand.
According to yet another embodiment of the present invention, the third ID generation circuitry is further capable of identifying a fourth dependent source register associated with a fourth dependent source operand of the second instruction, comparing an ID of the fourth dependent source register with the third dependent source register ID to determine a second younger ID and assigning the second younger ID to a corresponding one of the third dependent source operand and the fourth dependent source operand.
According to a further embodiment of the present invention, the instruction scheduling circuitry is further capable of comparing the second younger ID associated with the corresponding one of the third and fourth dependent source operands with the next retire ID and scheduling the second instruction for execution if the second younger ID is one of: 1) less than the next retire ID and 2) equal to the next retire ID.
The foregoing has outlined rather broadly the features and technical advantages of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features and advantages of the invention will be described hereinafter that form the subject of the claims of the invention. Those skilled in the art should appreciate that they may readily use the conception and the specific embodiment disclosed as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form.
Before undertaking the DETAILED DESCRIPTION OF THE INVENTION below, it may be advantageous to set forth definitions of certain words and phrases used throughout this patent document: the terms “include” and “comprise,” as well as derivatives thereof, mean inclusion without limitation; the term “or” is inclusive, meaning and/or; the phrases “associated with” and “associated therewith,” as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, or the like; and the term “controller” means any device, system or part thereof that controls at least one operation; such a device may be implemented in hardware, firmware or software, or some combination of at least two of the same. It should be noted that the functionality associated with any particular controller may be centralized or distributed, whether locally or remotely. Definitions for certain words and phrases are provided throughout this patent document; those of ordinary skill in the art should understand that in many, if not most, instances, such definitions apply to prior, as well as future, uses of such defined words and phrases.