1. Field of the Invention:
This invention relates to computer systems and, more particularly, to methods and apparatus for implementing processors used in reduced instruction set computers.
2. History of the Prior Art:
The development of digital computers has progressed through a series of stages beginning with processors which were able to carry out only a few basic instructions which were programmed at a machine language level and continuing to processors capable of handling very complicated instructions written in high level languages. At least one of the reasons for this development has been that high level languages are easier for programmers to use, and thus more programs are developed more rapidly. Another reason is that up to some point in the development, the more advanced machines executed operations more rapidly.
There came a point, however, where the constant increase in the ability of the computers to run more complicated instructions actually began to slow the operation of the computer over what investigators felt was possible with machines operating with only a small number of basic instructions. These investigators began to design advanced machines for running a limited number of instructions, a so-called reduced instruction set, and were able to demonstrate that these machines did, in fact, operate more rapidly for some types of operations. Thus began the reduced instruction set computer which has become known by its acronym, RISC.
One design of a RISC computer is based on the Scalable Processor Architecture (SPARC) designed by Sun Microsystems, Inc., Mountain View, Calif., and implemented in the line of SPARC computers manufactured by that company. One salient feature of the SPARC computers is the design of the processors, especially the architecture of the general purpose registers.
The general purpose registers include from forty to five hundred and twenty 32-bit registers. Whatever the total number of general registers, these registers are partitioned into eight global registers and a number of sixteen-register sets, each set divided into eight IN and eight local registers. At any time, an instruction can access a window including the eight global registers, the IN and local registers of one set of registers, and the IN registers of a logically-adjacent set of registers. These IN registers of the logically-adjacent set of registers are addressed as the OUT registers of the window whose sixteen-register set supplies the IN and local registers.
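The partitioning just described can be sketched as an address calculation. The following is a minimal illustration and not part of any cited implementation. The function names are hypothetical, and the flat layout (globals first, then each set with its IN registers ahead of its local registers) is an assumption. The window-relative numbering (0 to 7 global, 8 to 15 OUT, 16 to 23 local, 24 to 31 IN) follows the usual SPARC convention, which the text above does not state. Treating set number plus one as the logically-adjacent set is likewise an assumption.

```python
NUM_GLOBALS = 8
SET_SIZE = 16  # eight IN registers followed by eight local registers (assumed order)


def total_registers(num_sets):
    """Total general purpose registers for a given number of sixteen-register sets."""
    return NUM_GLOBALS + SET_SIZE * num_sets


def physical_index(window, reg, num_sets):
    """Map a window-relative register number (0..31) to an index in a flat register file.

    Hypothetical helper: the layout and the direction of adjacency are assumptions.
    """
    if not 0 <= reg < 32:
        raise ValueError("register number must be in 0..31")
    if not 0 <= window < num_sets:
        raise ValueError("window out of range")
    if reg < 8:
        # Global registers are shared by every window.
        return reg
    if reg < 16:
        # OUT registers are the IN registers of the logically-adjacent set.
        adjacent = (window + 1) % num_sets
        return NUM_GLOBALS + adjacent * SET_SIZE + (reg - 8)
    if reg < 24:
        # Local registers belong only to this set.
        return NUM_GLOBALS + window * SET_SIZE + 8 + (reg - 16)
    # IN registers of this set.
    return NUM_GLOBALS + window * SET_SIZE + (reg - 24)
```

Under these assumptions, two sets give the forty-register minimum and thirty-two sets give the five hundred and twenty maximum, and OUT register 8 of one window resolves to the same physical register as IN register 24 of the adjacent window.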
This architecture provides a number of advantages, not the least of which is that the processor may switch from register set to register set without having to save to memory and restore all of the information being handled by a particular register set before proceeding to the operation handled by the next register set. For example, since the IN registers of one register set are the same registers as the OUT registers of the preceding set of registers, the information in these registers may be utilized immediately by the next or previous sets of registers without the necessity of saving the information to memory and writing the information to the IN registers of the next set of registers. This saves a great deal of system operating time. Moreover, the large number of register sets which may be utilized in the SPARC architecture allows a great number of operations to be implemented simultaneously, in many cases without the need to save to memory and restore before proceeding with the operation in any particular register set. This offers great speed advantages over other forms of RISC architecture.
However, no matter how philosophically advanced the SPARC architecture, it requires implementation in hardware. One such implementation, described in U.S. patent application Ser. No. 07/437,978, entitled Method and Apparatus for Current Window Cache, Eric H. Jensen, filed Nov. 16, 1989, includes a processor made up of a large register file usually constructed of random access memory divided into a plurality of sets of windowed registers. In accordance with the general SPARC architecture, each such set includes a first plurality of IN registers and a second plurality of local registers. The IN registers of each set are addressable as the OUT registers of a logically-adjacent preceding set of registers while the OUT registers of each set are addressable as the IN registers of a logically-adjacent succeeding set of registers. A set of global registers which may be addressed with each of the sets of registers is provided along with circuit means for indicating which set of windowed registers is being addressed.
The processor also includes an arithmetic and logic unit and a cache memory comprising a number of lines at least equal to the total of the number of registers in an addressable set of windowed registers including the set of global registers, a set of IN registers, a set of OUT registers, and a set of local registers. The cache is provided with circuitry for changing the addresses of lines of the cache holding information presently designated as information held in OUT registers to addresses designating the IN registers of the next register set and vice versa. This arrangement essentially functions as a very rapid processor by using the registers of the cache in most cases in place of the normal register file. The use of the cache for processing allows cache speeds to be attained most of the time in processing even though the register file is constructed of relatively inexpensive random access memory and includes a very large number of window sets. The circuitry for changing the addresses of the cache lines between OUT and IN designations allows the single set of IN, OUT, local, and global registers of the cache to accomplish in a single cache window, without most of the store and restore operations, the transfer between windowed sets which usually requires multiple windowed register sets.
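The address-changing behavior of such a cache can be illustrated with a toy model. This is only a sketch under stated assumptions and does not reproduce the circuitry of the referenced application. The class name, the two interchangeable banks, and the policy of writing back and refilling the lines that leave the window are all hypothetical. The register numbering (8 to 15 OUT, 16 to 23 local, 24 to 31 IN) is the usual SPARC convention, and the register file layout is assumed.

```python
class WindowCacheSketch:
    """Toy model of a 32-line cache in front of a larger windowed register file."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.file = [0] * (8 + 16 * num_sets)  # backing register file (assumed layout)
        self.window = 0                        # set currently being addressed
        self.global_lines = [0] * 8            # held only in the cache in this sketch
        self.local_lines = [0] * 8
        self.banks = {"a": [0] * 8, "b": [0] * 8}
        self.in_bank, self.out_bank = "a", "b"  # which bank is addressed as IN / OUT

    def _in_base(self, s):
        return 8 + 16 * (s % self.num_sets)

    def _local_base(self, s):
        return self._in_base(s) + 8

    def _line(self, reg):
        if not 0 <= reg < 32:
            raise ValueError("register number must be in 0..31")
        if reg < 8:
            return self.global_lines, reg
        if reg < 16:
            return self.banks[self.out_bank], reg - 8
        if reg < 24:
            return self.local_lines, reg - 16
        return self.banks[self.in_bank], reg - 24

    def read(self, reg):
        lines, i = self._line(reg)
        return lines[i]

    def write(self, reg, value):
        lines, i = self._line(reg)
        lines[i] = value

    def _switch(self, step):
        # Write the local lines back to the current set in the register file.
        lb = self._local_base(self.window)
        self.file[lb:lb + 8] = self.local_lines
        # One bank drops out of the window and is written back as well.
        if step == 1:
            leaving, leaving_set = self.in_bank, self.window
        else:
            leaving, leaving_set = self.out_bank, self.window + 1
        base = self._in_base(leaving_set)
        self.file[base:base + 8] = self.banks[leaving]
        # The shared bank is not copied: only its designation changes.
        self.in_bank, self.out_bank = self.out_bank, self.in_bank
        self.window = (self.window + step) % self.num_sets
        # Refill the local lines and the bank that has just entered the window.
        lb = self._local_base(self.window)
        self.local_lines = self.file[lb:lb + 8]
        entering_set = self.window + 1 if step == 1 else self.window
        base = self._in_base(entering_set)
        self.banks[leaving] = self.file[base:base + 8]

    def advance(self):
        """Move to the next register set; lines addressed as OUT become IN."""
        self._switch(1)

    def retreat(self):
        """Move to the previous register set; lines addressed as IN become OUT."""
        self._switch(-1)
```

In this model a value written to an OUT register is readable as an IN register after `advance()` although the corresponding location in the backing register file has never been written, which is the effect the address change is meant to achieve.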
Such a cache-based processor functions well to increase the speed of operation of a SPARC-based processor. There seems to be no upper limit to the speed desired from a processor, however; and consequently even faster operation is desirable.