Microprocessors routinely perform instruction loops. FIG. 1 depicts a conventional method 10 for performing an instruction loop. The method 10 is used for performing the loop one or more times. Generally, the instructions in the loop are performed more than once. FIG. 2 depicts a conventional system 50 that is used in performing the instruction loop. The conventional system 50 includes a conventional fetcher 52 and conventional addition logic 54. The conventional addition logic 54 calculates the next sequential address, from which the conventional fetcher 52 will get the next instruction in the loop. In general, the conventional addition logic 54 calculates the nth digit of the next address by performing the operation An XOR Bn XOR Cn, where An is the nth digit of the address for the instruction that was just performed, Bn is the nth digit of the number one (added to increment the address), and Cn is the nth digit of the carry from the previous digit.
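The digit-wise operation described above can be illustrated with a short sketch. The following is only a software model of the increment, not the circuit of the conventional addition logic 54. The function name, the 16-bit address width, and the carry-out expression (the standard full-adder majority function) are assumptions made for the illustration and do not appear in the description.

```python
def increment_address(address: int, width: int = 16) -> int:
    """Add one to `address` digit by digit using sum_n = A_n XOR B_n XOR C_n.

    `width` is a hypothetical address width chosen for this sketch.
    """
    one = 1        # B: the number one, added to increment the address
    carry = 0      # C_0: there is no carry into the lowest digit
    result = 0
    for n in range(width):
        a_n = (address >> n) & 1   # nth digit of the current address
        b_n = (one >> n) & 1       # nth digit of the number one
        sum_n = a_n ^ b_n ^ carry  # A_n XOR B_n XOR C_n
        # Assumed carry-out: standard full-adder majority of the three inputs.
        carry = (a_n & b_n) | (a_n & carry) | (b_n & carry)
        result |= sum_n << n
    return result  # wraps to zero when every digit of `address` is one


if __name__ == "__main__":
    print(bin(increment_address(0b0111)))  # carry ripples through three digits
```

Because Bn is nonzero only for the lowest digit, the operation reduces to An XOR Cn for every other digit. The sketch keeps the general three-input form to match the description.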
The conventional method 10 commences after the conventional fetcher 52 has fetched a set of instructions at contiguous addresses. The current instruction is performed, via step 12. The first time step 12 is performed, the current instruction is the first instruction in the loop. Consequently, the current address is the address of the first instruction. It is determined whether the current instruction that was just performed is the last instruction in the loop, via step 14. If not, then the next instruction in the loop is set as the current instruction, via step 16. Step 16 includes determining the new current address and using the instruction at that address as the current instruction. Thus, step 16 generally includes using the conventional addition logic 54 to add one to the address of the current instruction and then setting the instruction corresponding to the new address as the current instruction. Step 12 is then returned to. If the last instruction in the loop was just performed, it is determined whether the loop has been performed the requisite number of times, via step 18. Step 18 thus determines whether the last iteration of the loop has just been performed. In one embodiment, step 18 determines whether a count corresponding to the number of times the loop is still to be performed is zero. Alternatively, step 18 might determine whether a count of the number of times the loop has been performed has reached the number of times the loop is to be performed. If the last iteration has been performed, then the conventional method 10 terminates. Otherwise, the count is adjusted, either by incrementing or by decrementing the count, via step 20. The method 10 branches to the first instruction in the loop, via step 22. Consequently, the conventional fetcher 52 is flushed, via step 24. The conventional fetcher 52 then fetches the instructions at a set of contiguous addresses that correspond to the first instructions in the loop, and sets the current instruction as the first instruction and the current address as the address of the first instruction, via step 26. Step 12 is then returned to.
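The flow of steps 12 through 26 can be traced with a small simulation. The following is a minimal sketch under stated assumptions, not the implementation of the conventional system 50: the function name, the starting address of zero, and the use of a down-counting count (one of the two alternatives mentioned for step 18) are hypothetical choices made for the illustration.

```python
def run_conventional_loop(loop_length: int, iterations: int):
    """Trace the described steps; return (addresses performed, number of flushes).

    Assumes loop_length >= 1 and iterations >= 1.
    """
    first_address = 0                        # hypothetical address of the first instruction
    last_address = first_address + loop_length - 1
    current = first_address                  # the fetcher has already fetched the loop
    remaining = iterations - 1               # iterations still to run after the current one
    performed = []
    flushes = 0
    while True:
        performed.append(current)            # step 12: perform the current instruction
        if current != last_address:          # step 14: was that the last instruction?
            current += 1                     # step 16: add one to the current address
            continue
        if remaining == 0:                   # step 18: last iteration just performed?
            break                            # the method terminates
        remaining -= 1                       # step 20: adjust the count
        flushes += 1                         # steps 22 and 24: branch back, flush the fetcher
        current = first_address              # step 26: refetch from the first instruction
    return performed, flushes


if __name__ == "__main__":
    trace, flushes = run_conventional_loop(loop_length=3, iterations=2)
    print(trace, flushes)
```

In this model the fetcher is flushed once for every branch back to the first instruction, that is, once per iteration except the last.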
Although the conventional method 10 and system 50 function, one of ordinary skill in the art will readily recognize that the conventional method 10 and system 50 are inefficient. Each time the loop branches back to the first instruction, the conventional fetcher 52 is flushed. Flushing the conventional fetcher 52 generally requires two cycles. Thus, each time the loop is repeated, there are approximately two dead cycles. As a result, the loop is less efficient than if there were one or zero dead cycles. For a short loop including a relatively small number of instructions, the dead cycles constitute a significant portion of the overhead for the loop. For example, if the loop includes four or two instructions, the flush of the conventional fetcher 52 consumes fifty or one hundred percent, respectively, of the time used to perform the instructions in the loop. Such shorter loops are often used in computer systems. Consequently, the conventional method 10 and system 50 are relatively inefficient.
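The percentages in the example above follow from simple arithmetic, sketched below. The sketch assumes that each instruction in the loop takes one cycle; that assumption is consistent with the figures given but is not stated in the description. The function and parameter names are hypothetical.

```python
def flush_overhead(loop_length: int, flush_cycles: int = 2,
                   cycles_per_instruction: int = 1) -> float:
    """Dead cycles per iteration, as a fraction of the cycles spent on the loop's instructions.

    cycles_per_instruction = 1 is an assumption made for this illustration.
    """
    useful_cycles = loop_length * cycles_per_instruction
    return flush_cycles / useful_cycles


if __name__ == "__main__":
    for length in (4, 2):
        print(f"{length} instructions: {flush_overhead(length):.0%} overhead")
```

With a two-cycle flush, a four-instruction loop incurs an overhead of 2/4, or fifty percent, and a two-instruction loop incurs 2/2, or one hundred percent. Reducing the flush to one or zero dead cycles would halve or eliminate that overhead.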
Accordingly, what are needed are a more efficient method and system for performing a loop, preferably multiple times. The present invention addresses such a need.