1. Field of the Invention
The present invention relates to a microprocessor architecture and, more specifically, to a high-speed apparatus for branch detection of a LOOP instruction in a microprocessor.
2. Technical Background
A LOOP instruction, which causes a sequence of actions (i.e., loop operations) to be repeated a specified number of times in a microprocessor, is one of the most frequently used instructions. For example, loop operations controlled by a LOOP instruction are essential when an iteration method is used to find the solutions of equations. The iterative calculation is carried out by repeatedly modifying the iterative values that are substituted into the equations. Constraints on the iterative values or on the number of loop operations may be applied to terminate the LOOP instruction. Therefore, when the LOOP instruction is executed, a subtraction operation may be performed by an arithmetic logic unit (ALU) and other related elements in the microprocessor to determine whether the LOOP instruction should be terminated.
The aforementioned operation of the LOOP instruction can be carried out in the microprocessor by the architecture illustrated in FIG. 1. Referring to FIG. 1, the architecture includes register 11, ALU port 13, ALU 15, flag generator 17 and branch detector 19. The state information of the loop operations is stored in register 11. The information includes a count-down value which initially equals the number of loop operations that the LOOP instruction has to perform. ALU 15 obtains the count-down value through ALU port 13, decrements it by 1, and writes the updated count-down value back to register 11 each time a loop operation is about to proceed. At the same time, flag generator 17 uses the updated count-down value to generate a flag. Branch detector 19 then examines the flag to detect whether it is a zero-flag, which indicates that the LOOP instruction is to be terminated.
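The data path of FIG. 1 may be illustrated by the following behavioral sketch. It is offered only as an aid to understanding and is not a description of any actual circuit. The function names `loop_step` and `count_steps` are hypothetical, and the sketch assumes an initial count-down value of at least 1.

```python
from typing import Tuple


def loop_step(count_down: int) -> Tuple[int, bool]:
    """One pass through the FIG. 1 data path (hypothetical model).

    The count-down value arrives via ALU port 13, ALU 15 subtracts 1,
    and flag generator 17 derives a flag from the updated value.
    """
    updated = count_down - 1       # ALU 15: decrement by 1
    zero_flag = (updated == 0)     # flag generator 17: zero-flag when the result is 0
    return updated, zero_flag


def count_steps(initial_count: int) -> int:
    """Count the decrement steps until branch detector 19 sees a zero-flag."""
    if initial_count < 1:
        # Assumption of this sketch: the count starts at 1 or more.
        raise ValueError("initial_count must be at least 1")
    register_11 = initial_count    # register 11 holds the count-down value
    steps = 0
    while True:
        register_11, zero_flag = loop_step(register_11)  # write-back to register 11
        steps += 1
        if zero_flag:              # branch detector 19: terminate the LOOP instruction
            return steps
```

Under these assumptions, an initial count-down value of N produces exactly N decrement steps before the zero-flag is detected.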
With the rapid progress being made in VLSI technology, the clock rates of microprocessors have increased to several tens or even hundreds of MHz. That is, elements in the microprocessor are driven by a clock whose period is shorter than 100 ns (a 10 MHz clock has a period of 100 ns, and a 100 MHz clock has a period of 10 ns). Consequently, the structure shown in FIG. 1 can hardly finish the zero-flag detection for each loop operation within such a short clock period. If the zero-flag detection cannot be finished in time, an erroneous loop operation may proceed even though the LOOP instruction should have been terminated.
In order to overcome this problem, more than one clock period is typically allotted to the zero-flag detection, and each loop operation must therefore be delayed. For example, referring to FIG. 2, flag register 18 can be introduced into the structure of FIG. 1. Flag register 18 is connected between flag generator 17 and branch detector 19 for temporarily recording the flag generated by flag generator 17 in a first time period. The flag recorded in flag register 18 can be accessed and detected by branch detector 19 in a second time period. The first and second time periods may each consist of one or more clock periods. Since a longer execution time is required to determine whether the next loop operation can proceed, the operating efficiency of the microprocessor is reduced when the LOOP instruction is executed therein. Moreover, since the loop operation state can be determined only after a new count-down value has been generated by ALU 15, the time delay problem becomes increasingly serious as the operating speed of a high-speed microprocessor increases. Therefore, a more efficient architecture for detecting the loop operation state in accordance with the LOOP instruction is needed.
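The cost of the two-period arrangement of FIG. 2 may be illustrated by the following sketch, which counts the clock periods spent on zero-flag detection. It is a simplified model under stated assumptions, not a timing analysis of a real device. The function name `fig2_clock_periods` and the parameters `first_period` and `second_period` (the lengths, in clock periods, of the first and second time periods) are hypothetical, and the sketch assumes that the two time periods of one loop operation do not overlap with those of the next.

```python
def fig2_clock_periods(initial_count: int,
                       first_period: int = 1,
                       second_period: int = 1) -> int:
    """Clock periods used for zero-flag detection in a FIG. 2 style model."""
    if initial_count < 1 or first_period < 1 or second_period < 1:
        # Assumption of this sketch: all quantities are at least 1.
        raise ValueError("all arguments must be at least 1")
    register_11 = initial_count
    clocks = 0
    while True:
        # First time period: ALU 15 decrements the count-down value and
        # flag generator 17 produces the flag, which flag register 18 records.
        register_11 -= 1
        flag_register_18 = (register_11 == 0)
        clocks += first_period
        # Second time period: branch detector 19 reads the recorded flag.
        clocks += second_period
        if flag_register_18:
            return clocks
```

In this model, an initial count-down value of N costs N × (`first_period` + `second_period`) clock periods, so even the shortest case of one clock period per time period doubles the detection time relative to a single-period detection.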