1. Technical Field of the Invention
The present invention is concerned with a system, apparatus and method for controlling cache memory in a data processing system, and in particular to the provision of improved cache addressing in a data processing system. The present invention is of general applicability in data processing systems, and particularly where the speed at which data can be retrieved is of concern.
2. Description of the Related Art
In conventional computer systems, instructions and data are stored in main storage and fetched from main storage by a memory management system for execution or use by a central processor unit, or possibly by some special function unit, such as a floating-point processor. In some systems, some instructions and data may be retained after their use in a cache memory which can be accessed more quickly than the main storage, so that such instructions and data can be reused later in the execution of the same program. This improves the execution performance of the computer system by reducing the time taken to fetch the instructions and data for processing by the central processing unit.
In systems having caching, the number of cycles taken to retrieve an instruction or a data item depends on whether the instruction or data item is already in the cache or not, and on how many instructions are required to address or retrieve the instruction or data item. If the instruction or data item is not in the cache (a xe2x80x9ccache missxe2x80x9d), the instruction or data item must be fetched from main memory, which consumes some number of instruction cycles. If the data item or instruction is in the cache, some instruction cycles will also be consumed, although they will be fewer than in the case of a cache miss. Nevertheless, any improvement that can be made in the processing of cached data and instructions is useful, and, in certain circumstances, may make a considerable difference to the processing performance of the system.
Improvements in cache memory performance have been sought utilizing various methods of linking and associating groups of cache lines. One example is the use of set-associative caching, wherein each cache line is placed in a logically appropriate set, and the addressing mechanism then locates first the set, and then the individual cache line within that set. In caches comprising simple set-associative mechanisms based on addressing, it is not necessary to store the full address in each cache line; part of the address can be deduced from the set association itself.
Another technique frequently used is a hash table. A hash table is, in effect, an abbreviated index to the cache lines, which reduces the average time taken in searching for an entry or in determining that the data is not present in the cache and will therefore need to be fetched from main memory.
There are also various techniques for arranging the contents of a cache memory. For example, the cache lines may be arranged as a linked list, in which each element except the last contains a pointer forward to the next element. An element can thus be located by following the chain of pointer references. A more sophisticated technique makes use of a doubly-linked list, in which each element except the last contains a pointer forward to the next element, and each element except the first contains a pointer backward to the previous element. The process of locating an element in such a structure can thus proceed either forwards or backwards along the chain of pointer references.
An attempt to solve the problems of cache management using hash tables and a least-recently-used cache line eviction technique is disclosed in U.S. Pat. No. 5,778,430, which also describes the use of linked lists and circular linked lists to improve the efficiency of searching for the least-recently-used cache line.
All the known techniques of organizing and addressing caches have disadvantages in terms of the numbers of instructions required to process them. Caches addressed using set-associative techniques and hash tables can still have problems in terms of the number of instructions required to process insertions, deletions and the initial addressing of a cache line. The known cache arrangements, such as linked lists and doubly-linked lists, require extra instructions to handle the various special cases, such as the case of an empty list, or the case of a deletion of a last cache line from a list.
It is therefore an object of the present invention to provide an improved data processing system.
It is another object of the present invention to provide an improved cache memory and method of operation thereof.
To achieve the foregoing objects, and in accordance with the invention as embodied and broadly described herein, an improved cache memory is disclosed. The cache memory includes a doubly-linked loop of cache lines and a single pointer operable to address a cache line in the doubly-linked loop. A doubly-linked loop, also sometimes called a circular doubly-linked list, is advantageously utilized to provide a set of efficient primitive operations for addressing and manipulating the cache lines.
In one embodiment of the present invention, the pointer is operable to address a next cache line in the doubly-linked loop. Alternatively, in another advantageous embodiment, the pointer is operable to address a previous cache line in said doubly-linked loop. In a related embodiment, the pointer is stored in a register.
In another embodiment of the present invention, the cache memory further includes cache lines having address data, a dirty marker and an empty marker arranged as a singly-loadable unit.
In a second aspect of the present invention, a data processing system includes a processor, a main memory and at least one cache memory having a doubly-linked loop of cache lines and a single pointer operable to address a cache line in the doubly-linked loop.
In one embodiment of the present invention, the data processing system further includes a plurality of cache memories and a hash table of entries for addressing the plurality of cache memories. Alternatively, in another advantageous embodiment, the data processing system further includes a plurality of cache memories and a set associative mechanism for addressing the plurality of cache memories. In a related embodiment, the data processing system further includes a pointer stored in a register of the processor.
In a third aspect of the present invention, a method for implementing a cache memory is disclosed. The method includes utilizing a doubly-linked loop of cache lines and employing a single pointer operable to address a cache line in the doubly-linked loop. To purge a cache line, the method further includes marking a current cache line as empty and clean. Next, a pointer is employed to point to a next cache line. To load an item into the cache line, the method further includes pointing a pointer to a cache line at a least-recently-used position. Thereafter, the item to be loaded is loaded from a memory into the cache line.
The present invention advantageously utilizes the simplified pointer manipulation operations available to a doubly-linked loop to give an improvement in code speed and compactness and to reduce code complexity. The present invention does not require the special-case processing that is necessary in typical previously-existing solutions to the problems of cache memory control.
The advantage of a loop over a list is the increased symmetry. With a list, it is necessary to do special processing when you reach an end, whereas loops do not have ends, and thus the special processing is not necessary. The anchor structure for a doubly linked loop is a simple pointer, for example, to the head position. The anchor structure for a doubly-linked list is two pointersxe2x80x94one to the head position and one to the tail position. The extra pointer is necessary because the xe2x80x9cprevxe2x80x9d pointer of the head element and the xe2x80x9cnextxe2x80x9d pointer of the tail element are not used to hold useful information.
With a loop it is possible to step from the xe2x80x9ctailxe2x80x9d to the xe2x80x9cheadxe2x80x9d without additional overhead. With a doubly-linked loop it is possible to go either way. Hence it is easy to make the xe2x80x9ctailxe2x80x9d element the new xe2x80x9cheadxe2x80x9d element merely by stepping the xe2x80x9chead pointerxe2x80x9d one step back. It is not necessary to physically remove the tail element and prepend it to the head (that is, it is not necessary to break and remake any of the links). Similarly, it is possible to logically move the head element to the tail position by a simple forward step of the xe2x80x9chead pointerxe2x80x9d
It typically costs one processor instruction to step a pointer one position around the loop. The cost of removing an element and reinserting it elsewhere in the list is considerably higher.
Doubly-linked loops also have the same advantage as doubly-linked lists: it is easy to remove an element from the list. However, the overhead for adding an element to the list is higher than for a single-linked loop because there are more link pointers to update.
The present invention organizes cache lines into a doubly-linked loop structure in such a way that the single step primitive is efficiently used. Examples include promoting an unused (or LRU but occupied) cache line to the head position prior to filling it, or demoting a newly purged cache line from the head position. Additionally because the anchor structure is a single pointer, it is easier to hold it in a register within the processor for efficient operation.
Additionally the invention makes use of certain specializations of the double linked loop, most notably that there is more than one cache line within the cache. Thus, when moving a random cache line to the head position, it can be assumed that the loop will be non-empty after the removal of the cache line from its original position prior to insertion into its new position. This removes the need for code to handle the empty case, making the code simpler and faster.
The foregoing description has outlined, rather broadly, preferred and alternative features of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features of the invention will be described hereinafter that form the subject matter of the claims of the invention. Those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiment as a basis for designing or modifying other structures for carrying out the same purposes of the present invention. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form.