1. Field of the Invention
This invention relates generally to computer systems that store instructions in memory in compressed and uncompressed form, and more particularly to computer systems that store instructions in compressed form in a main memory and store corresponding uncompressed instructions in an instruction cache.
2. Description of the Related Art
Computer systems store instructions in a main memory. Main memories tend to have a high capacity, but they also tend to be relatively slow to access. Also, the instructions can be stored in compressed form in order to increase the capacity of the main memory. However, the compression also slows access time because the compressed instructions must be decompressed before they can be processed.
Therefore, a faster cache memory is often employed to store certain frequently-used instructions. Instructions are stored in the cache memory in uncompressed form so that they can be accessed without being delayed by decompression pre-processing. However, cache memories generally have a more limited capacity and may be able to hold only some of the instructions of an ordered instruction set. Conventionally, when an instruction is called for that is not present in the cache, a cache miss operation (or miss to main memory operation) is executed and the instruction is accessed in compressed form from the main memory based on its expected address in the cache memory.
Instructions are typically grouped into instruction blocks and instruction pages. A program consists of an ordered list of instructions having virtual addresses 0 to n. In a typical RISC system, where each instruction is 32 bits long, a computer will group these instructions into instruction pages of 1024 instructions each. Typically, the instructions are also grouped into blocks so that there are 8 instructions per instruction block and 128 instruction blocks per page.
In most computers, instruction blocks have a fixed length, regardless of whether the instructions constituting the instruction block have a fixed instruction length.
An exemplary prior art computer system 100 is shown in FIG. 1. The computer system 100 includes main memory 102, instruction cache 104, instruction decompressor 106, address translation look-up table ATLT 108, and central processing unit (CPU) 112. As shown in FIG. 1, ordered, compressed instructions CI0 to CIX are stored in main memory 102 so that they are scattered across main memory lines MA0 to MAY. As explained in detail below, ATLT 108 controls the storage of compressed instructions CI so that these instructions take up as little of the available main memory space as possible.
Because the compression is approximately 50 percent, two compressed instruction blocks CI will usually be stored on one main memory line MA. For example, instruction blocks CI0 and CI1 are stored on a main memory line MA0.
However, three or more compressed instruction blocks CI will be stored on a single main memory line MA, if they will fit. For example, instruction blocks CI2, CI3 and CI4 are stored on main memory line MA1. Also, sometimes it is only possible to fit a single instruction block CI on a main memory address line MA. For example, instruction block CI5 is the only instruction block stored at main memory address line MA4. Note that no compressed instruction blocks CI are stored at MA2 or MA3 because these lines are unavailable due to the fact that these addresses are being used to store other unrelated data.
Before the compressed instruction blocks CI0 to CIX are to be used by CPU 112, they are first translated into uncompressed form by instruction decompressor 106 and stored in instruction cache 104 as corresponding uncompressed instruction blocks having physical addresses UI0 to UIX. As shown in FIG. 1, the uncompressed instruction blocks UI0 to UIX are stored in instruction cache 104 so that each uncompressed instruction block UI is stored on a single line I$ of instruction cache 104. In order to simplify this example, UI0 to UIX are stored in order at consecutive cache lines I$0 to I$X.
When a miss to main memory 102, such as an instruction cache miss, is executed in prior art computer system 100, the missing instruction UI is translated by address lookup table 108 into a main memory line address MA, so that the uncompressed instruction block UIn, which is supposed to be stored at information cache line address l$n can be found in compressed form CIn at the corresponding main memory address MAnn (where n is any one of the ordered instruction blocks from 0 to X, and nn depends on how the compressed instructions CI are scattered through the main memory 102). For example, an instruction cache miss of instruction block UI5 involves sending UI5 to address translation look-up table 108. Address translation look-up table 108 translates the uncompressed address UI5 into the corresponding main memory compressed address MA4 because that is where the corresponding compressed instruction block CI5 is stored in main memory 102.
Translator 108 employs an address look-up table because there is no predictable correlation between addresses in the uncompressed space of information cache 104 and corresponding addresses of the compressed space in main memory 102. Address look-up table 108 takes space and time to initiate, maintain and utilize. This can result in complicated and expensive computer hardware and relatively high silicon cost.
According to the present invention, there is an xe2x80x9calgebraicxe2x80x9d relationship between memory line addresses for instruction blocks stored in uncompressed form (for example, in an instruction cache) and corresponding memory line addresses for corresponding instruction blocks stored in compressed form (for example, in a main memory). Preferably, some set integral number n of compressed instruction blocks is stored on each line of main memory, with the instruction blocks being placed in consecutive order on consecutive memory lines. In this way, line addresses in uncompressed memory space will be easy to translate to compressed addresses because uncompressed addresses will be proportional to the corresponding compressed memory addresses. Even more preferably, two instructions are stored on each consecutive line of compressed memory, because it is relatively easy to reliably utilize this two-instructions-per-line scheme, even when the compression is as low as 50 percent.
As stated above, there is an xe2x80x9calgebraicxe2x80x9d relationship between memory line addresses in uncompressed space and corresponding memory line addresses in compressed space. As used herein, xe2x80x9calgebraicxe2x80x9d refers to any mathematical function or combination of mathematical functions commonly implemented on computers. Such mathematical functions include but are not limited to addition, subtraction, multiplication, division, rounding to an integer value, exponents, factorials, logarithms, trigonometric functions and so on.
It is noted that, after compression, instruction blocks often vary in length, and the predetermined number n of instruction blocks may not fit on a single line of compressed memory space when dealing with a longer-than-expected instruction block. One solution to this problem is to set n sufficiently low so that the predetermined number of instruction blocks n will always fit in the compressed memory space. The drawback to the solution is that decreasing n will increase the amount of compressed memory space that is required. Also, n cannot be less than two.
In at least some embodiments of the present invention, a different solution is used. More particularly, a flag and a pointer are stored in compressed memory in the location which had been allocated for the longer-than-expected instruction block. The pointer points to another line address of the compressed memory where the longer-than-expected instruction block has been alternatively stored, where it is out of the way of the set of compressed addresses that follow an algebraic pattern for ease of uncompressed-to-compressed address translation.