1. Field of the Invention
This invention relates to computer systems and, more particularly, to methods and apparatus for accelerating the reordering of instructions in an improved microprocessor.
2. History of the Prior Art
Recently, a new microprocessor was developed which combines a simple but very fast host processor (called “morph host”) and software (called “code morphing software”) to execute application programs designed for a processor different than the morph host processor at a rate which cannot be attained by the processor for which the programs were designed (the target processor). The morph host processor executes the code morphing software to translate the application programs into morph host processor instructions which accomplish the purpose of the original target software. As the target instructions are translated, the resulting host instructions are both executed and stored in a translation buffer where they may be accessed without further translation. Although the initial translation and execution of a program is slow, once translated, many of the steps normally required to execute a program in hardware are eliminated.
In order to be able to execute programs designed for other processors at a rapid rate, the morph host processor includes a number of hardware enhancements. One of these enhancements is a gated store buffer which resides between the host processor and the translation buffer. A second enhancement is a set of host registers which store state of the target machine at the beginning of any sequence of target instructions being translated. Sequences of target instructions spanning known states of the target processor are translated into morph host instructions and placed in the translation buffer awaiting execution. If the translated instructions execute without raising an exception, the target state at the beginning of the sequence of instructions is updated to the target state at the point at which the sequence completed.
If an exception occurs during the execution of the sequence of host instructions which have been translated, the processing stops; and the entire operation may be returned or rolled back to the beginning of the sequence of target instructions, at which point a known state of the target machine exists. This allows very rapid and accurate handling of exceptions while dynamically translating and executing instructions, a result which had never been accomplished by the prior art.
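The commit and rollback behavior described above can be modeled in software. The following is a minimal sketch only, not the hardware design: the class `TargetState`, its fields, and the function `run_translation` are hypothetical names chosen for illustration, and a Python exception stands in for a target exception.

```python
class TargetState:
    """Toy model of committed versus working target state (hypothetical names)."""

    def __init__(self, registers, memory):
        self.committed_regs = dict(registers)  # last known-good target state
        self.working_regs = dict(registers)    # state mutated while executing
        self.memory = dict(memory)             # committed memory contents
        self.gated_stores = []                 # stores held back until commit

    def store(self, addr, value):
        # Stores do not reach memory until the whole sequence succeeds.
        self.gated_stores.append((addr, value))

    def load(self, addr):
        # The newest gated store to this address wins over committed memory.
        for a, v in reversed(self.gated_stores):
            if a == addr:
                return v
        return self.memory.get(addr, 0)

    def commit(self):
        # Drain gated stores to memory and advance the known-good registers.
        for a, v in self.gated_stores:
            self.memory[a] = v
        self.gated_stores.clear()
        self.committed_regs = dict(self.working_regs)

    def rollback(self):
        # Discard gated stores and restore registers to the last commit point.
        self.gated_stores.clear()
        self.working_regs = dict(self.committed_regs)


def run_translation(state, ops):
    """Run a translated sequence; commit on success, roll back on an exception."""
    try:
        for op in ops:
            op(state)
    except ArithmeticError:
        state.rollback()
        return False
    state.commit()
    return True
```

In this sketch, a sequence that raises leaves memory and the committed registers exactly as they were at the start of the sequence, which is the property the text attributes to the gated store buffer and the additional host registers.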
Additional speed is attained in running the new microprocessor by a scheduler which is part of the code morphing software. The scheduler reorders and reschedules the instructions as they are being translated from a naive order produced by raw translation into an order which produces the same result but allows faster execution. A scheduler attempts to place certain instructions ahead of other instructions or to run instructions together so that the execution of the rescheduled software takes less time. Schedulers function with a number of constraints, the most basic of which is that the rescheduled program must still produce the same ultimate results as the original program.
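As an illustration of reordering under the constraint that results must not change, the sketch below implements a simple greedy list scheduler over register operations. It is an assumption-laden toy, not the scheduler of the code morphing software: the `Op` tuple, the latency values, and the "longest latency first" heuristic are all hypothetical, and memory operations are deliberately left out.

```python
from collections import namedtuple

# Hypothetical instruction record: destination register, source registers,
# and an assumed latency in cycles.
Op = namedtuple("Op", "name dst srcs latency")


def depends(later, earlier):
    """True if `later` must stay after `earlier` (register hazards only)."""
    raw = earlier.dst in later.srcs   # read after write
    war = later.dst in earlier.srcs   # write after read
    waw = later.dst == earlier.dst    # write after write
    return raw or war or waw


def schedule(ops):
    """Greedy list scheduling: among ready ops, issue the longest-latency one."""
    remaining = list(ops)
    ordered = []
    while remaining:
        # An op is ready if it depends on no op still ahead of it.
        ready = [op for i, op in enumerate(remaining)
                 if not any(depends(op, prev) for prev in remaining[:i])]
        pick = max(ready, key=lambda op: op.latency)
        ordered.append(pick)
        remaining.remove(pick)
    return ordered


naive = [
    Op("add", "r1", ("r2", "r3"), 1),
    Op("load", "r4", ("r5",), 3),
    Op("mul", "r6", ("r1", "r4"), 2),
]
reordered = schedule(naive)
```

Here the load is independent of the add, so the sketch hoists it to the front to start its longer latency early, while the multiply stays last because it reads both results.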
As an example, there are sequences of instructions in programs which must be carried out without interruption in order for the sequences to produce the correct results. A scheduler cannot interfere with such sequences without interfering with the results produced. Many processors provide hardware interlocks to assure that such sequences are, in fact, run without interruption. The need to protect such sequences of instructions poses special constraints for processors without hardware interlocks such as the advanced morph host processor being discussed. Software must somehow be aware of such sequences and assure that they are run without interruption.
Control dependencies are another traditional constraint on reordering which a scheduler faces. Control dependencies relate to branch instructions; a scheduler must assure that the reordering of instructions which occur before and after a branch does not cause the program to run incorrectly.
Other dependencies affect the reordering of loads with respect to stores. For example, if updated data is to be stored to a memory address and then manipulated in a register operation, the data at the address should not be kept in a register at the time the store occurs or the data in the register may be stale.
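The stale-data hazard can be shown with a very small interpreter. This is an illustrative sketch only; the tuple encoding of instructions, the function `execute`, and the address and register names are assumptions made for the example.

```python
def execute(program, memory, registers):
    """Run (kind, register, address) tuples; return final registers and memory."""
    mem = dict(memory)
    regs = dict(registers)
    for kind, reg, addr in program:
        if kind == "store":
            mem[addr] = regs[reg]      # write the register value to memory
        elif kind == "load":
            regs[reg] = mem[addr]      # read memory into the register
        else:
            raise ValueError(f"unknown instruction kind: {kind}")
    return regs, mem


# Original order: the store updates address 100 before the load reads it.
original = [("store", "r1", 100), ("load", "r2", 100)]
# Illegal reordering: the load is hoisted above the store to the same address.
hoisted = [("load", "r2", 100), ("store", "r1", 100)]

regs_ok, _ = execute(original, {100: 0}, {"r1": 42})
regs_stale, _ = execute(hoisted, {100: 0}, {"r1": 42})
```

With the original order `r2` receives the freshly stored value; with the load hoisted, `r2` holds the old contents of the address, which is the stale value the scheduler must avoid.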
All of these constraints cause a typical scheduler to function very conservatively and, consequently, to produce slower code.
A traditional scheduler does its best to determine those instructions which depend on one another in order to accomplish reordering. The usual scheduler can determine that some operations depend on other operations in some way and that some operations do not depend on other operations in any way, but for the remaining operations it cannot determine whether a dependency exists. Such a scheduler treats those operations which depend on other operations conservatively by keeping them in the naive order in which they originated. Such a scheduler freely reorders operations which do not depend on other operations at all. Finally, it treats all operations about which it cannot make a determination regarding dependencies as though they depended on one another and handles them conservatively and slowly.
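The three-way classification described above can be sketched for the case of a store followed by a load. This is a hypothetical model: the `Dep` enumeration, the functions `classify` and `may_reorder`, and the convention that `None` means "address not known at scheduling time" are assumptions introduced for the example.

```python
from enum import Enum


class Dep(Enum):
    DEPENDENT = "dependent"
    INDEPENDENT = "independent"
    UNKNOWN = "unknown"


def classify(store_addr, load_addr):
    """Classify a store/load pair; an address of None is unknown to the scheduler."""
    if store_addr is None or load_addr is None:
        return Dep.UNKNOWN
    if store_addr == load_addr:
        return Dep.DEPENDENT
    return Dep.INDEPENDENT


def may_reorder(dep):
    """A conservative scheduler reorders only provably independent pairs."""
    return dep is Dep.INDEPENDENT
```

Under this model `may_reorder(classify(None, 100))` is false: the unknown case is handled exactly like the dependent case, which is the source of the slower code the text describes.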
It is desirable to provide circuitry and software for enabling a scheduler of an advanced processor to generate code which executes at an accelerated speed.