The present invention concerns computing devices and pertains particularly to the formation of linked lists using content addressable memory.
Linked lists are useful constructs in many applications. One such application is ensuring cache coherency in a multiprocessor (MP) system. Organizing accesses to the same memory line in a linked list allows these requests to be serviced in the arrival order even in the presence of conflicts. This ensures fairness and prevents starvation problems that could occur when conflicts are resolved using retry or other methods.
A cache memory is a small, high-speed buffer memory which is used to hold temporarily those portions of the contents of main memory which it is believed will be used in the near future by a processor. The main purpose of a cache memory is to shorten the time necessary to perform memory accesses, either for data or instruction fetch. The information located in cache memory may be accessed in much less time than information located in main memory. Thus, a processor with a cache memory needs to spend far less time waiting for instructions and operands to be fetched and/or stored.
A cache memory is made up of many cache lines, each holding one or more words of data. Each cache line has associated with it an address tag that uniquely identifies the memory line of main memory of which the cache line is a copy. Each time the processor makes a memory reference, an address tag comparison is made to see if a copy of the requested data resides in the cache memory. If the desired memory line is not in the cache memory, the memory line is retrieved from the main memory, stored in the cache memory as a cache line and supplied to the processor.
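The tag comparison described above can be illustrated with a minimal sketch. The sketch assumes a direct-mapped organization, which the text does not specify, and the names Cache, LINE_WORDS and NUM_LINES, along with the sizes chosen, are hypothetical and serve only for illustration.

```python
LINE_WORDS = 4   # words per cache line (assumed for illustration)
NUM_LINES = 8    # number of cache lines (assumed for illustration)


class Cache:
    """Hypothetical direct-mapped cache in front of a word-addressed memory."""

    def __init__(self, main_memory):
        self.main_memory = main_memory      # list of words
        self.tags = [None] * NUM_LINES      # address tag of each cache line
        self.lines = [None] * NUM_LINES     # cached copy of a memory line
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        line_addr = addr // LINE_WORDS      # memory line containing addr
        index = line_addr % NUM_LINES       # cache line that may hold it
        tag = line_addr // NUM_LINES        # identifies which memory line
        if self.tags[index] == tag:
            self.hits += 1                  # tag comparison succeeded
        else:
            self.misses += 1                # fetch the line from main memory
            base = line_addr * LINE_WORDS
            self.lines[index] = self.main_memory[base:base + LINE_WORDS]
            self.tags[index] = tag
        return self.lines[index][addr % LINE_WORDS]
```

In this sketch the first reference to a memory line misses and fills the cache line, and later references to the same memory line hit until a different memory line that maps to the same cache line displaces it.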
In addition to using a cache memory to retrieve data from main memory, the processor may also write data into the cache memory instead of directly to the main memory. When the processor desires to write data to the memory, the cache memory makes an address tag comparison to see if the cache line into which data is to be written resides in the cache memory. If the cache line exists in the cache memory and is modified (dirty) or exclusive, the data is written into the cache line in the cache memory. In many systems a "dirty bit" for the cache line is then set. The dirty bit indicates that data in the cache line is dirty (i.e., has been modified), and thus before the cache line is deleted from the cache memory the modified data must be written into main memory. If the cache line into which data is to be written does not exist in the cache memory, either the memory line must be fetched as exclusive into the cache memory or the data must be written directly into the main memory.
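The write path and the role of the dirty bit can be sketched as follows. This is a simplified illustration for a single cache only: it assumes that a missing line is always fetched rather than written directly to main memory, and it treats every fetched line as exclusive. The class and method names are hypothetical.

```python
class WriteBackCache:
    """Hypothetical single write-back cache with one dirty bit per line."""

    def __init__(self, main_memory, line_words=4):
        self.main_memory = main_memory      # list of words
        self.line_words = line_words
        self.lines = {}                     # line address -> {"data", "dirty"}

    def write(self, addr, value):
        line_addr, offset = divmod(addr, self.line_words)
        line = self.lines.get(line_addr)
        if line is None:
            # Line not present: fetch it (assumed exclusive) before writing.
            base = line_addr * self.line_words
            data = self.main_memory[base:base + self.line_words]
            line = {"data": data, "dirty": False}
            self.lines[line_addr] = line
        line["data"][offset] = value        # write into the cache line only
        line["dirty"] = True                # main memory is now stale

    def evict(self, line_addr):
        line = self.lines.pop(line_addr)
        if line["dirty"]:
            # Modified data must reach main memory before the line is dropped.
            base = line_addr * self.line_words
            self.main_memory[base:base + self.line_words] = line["data"]
```

Until the line is evicted, main memory in this sketch continues to hold the old value, which is the situation a coherence mechanism must account for when other processors access the same memory line.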
A shared-memory MP system has a potentially large number of processors, each with its own cache(s). When an access to memory is made in such a system, it is necessary to take steps to ensure the integrity of the data accessed. For example, when an entity reads data from memory, it is important to determine whether an updated version of the data resides in the cache of a processor on the system. If an updated version of the data exists, something must be done to ensure that the entity accesses the updated version of the data. A mechanism that assures that the updated version of the data is utilized in a memory reference is referred to herein as a coherence mechanism.
The most common coherency mechanism is snooping, which usually requires the processors to share a bus. However, for electrical reasons, only a limited number of processors can share a bus. Therefore, when the number of processors in an MP system is large, snooping can no longer be used efficiently for cache coherency.
The most common cache coherency mechanism for systems with a large number of processors is a directory structure in memory. Within the directory structure, line state information exists for each memory line within the memory. The line state information consists of a number of bits for each memory line. The bits for each memory line indicate the state of that memory line (Private, Shared, etc.), plus extra information relevant to that state. When the memory line is held "Private" in a cache of a first processor, this means that the memory line is not available for use by other processors until released by the first processor, and that the first processor is allowed to modify the contents of that memory line. When the memory line is held "Shared" in a cache of a first processor, this means that the memory line is available for use by other processors as long as the other processors do not want to hold the memory line as "Private", and while the line is held "Shared" the contents of the line are not allowed to be modified.
When a processor desires to access a memory line, a request is sent to a memory controller for the memory line. The memory controller reads line state information for the memory line to determine the current state of the requested memory line. If the line state information bits for the requested memory line indicate that the memory line is held Private in another cache, the memory line is recalled to the memory controller. When the memory line comes back to the memory controller, the memory controller supplies the memory line to the requester, updates the memory line's line state information and, if necessary, updates the data for the memory line in memory. If the memory line is requested as "Private" and the memory controller reads the line state information and finds the memory line is "Shared", the memory controller invalidates copies of the memory line in other caches (as indicated by the line state information) and then supplies the memory line to the requester. The memory controller also tags the memory line's line state information as "Private" and indicates the processor which now possesses the memory line.
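The directory lookup described above can be sketched as follows. The sketch is deliberately simplified: it assumes that recalls and invalidates complete immediately, whereas in practice they take time, and it records only the line state and the set of holding processors. The class name, the "Uncached" state and the action tuples are hypothetical and serve only for illustration.

```python
class Directory:
    """Hypothetical directory holding line state and holders per memory line."""

    def __init__(self):
        self.state = {}   # line -> (state name, set of holding processors)

    def request(self, line, requester, private):
        """Return the actions the memory controller would take, in order."""
        state, holders = self.state.get(line, ("Uncached", set()))
        actions = []
        if state == "Private" and holders != {requester}:
            # Held Private in another cache: recall the line first.
            actions.append(("recall", line, next(iter(holders))))
            holders = set()
        elif state == "Shared" and private:
            # Requested Private while Shared: invalidate the other copies.
            for holder in sorted(holders - {requester}):
                actions.append(("invalidate", line, holder))
            holders = set()
        # Update the line state information, then supply the line.
        if private:
            self.state[line] = ("Private", {requester})
        else:
            self.state[line] = ("Shared", holders | {requester})
        actions.append(("supply", line, requester))
        return actions
```

For example, if a hypothetical processor P0 holds a line Private and P1 then requests it as Shared, the sketch returns a recall directed at P0 followed by a supply to P1, and the line state becomes Shared.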
A memory line recall or invalidate can take significant time to complete. Meanwhile, new requests for the same memory line can be received by the memory controller. Retrying these new requests is difficult in a large system because of the need to provide fairness and prevent starvation. Instead, the new requests for that memory line could be queued as a linked list for that memory line in the memory controller. Once the recalled data or the invalidate acknowledgment is received, the memory controller services the requests for that memory line in the linked list in the order the requests were received. Multiple linked lists, one per recalled memory line, can exist at any time in the memory controller.
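The queuing behavior can be illustrated with a minimal sketch that keeps one queue per memory line with an outstanding recall or invalidate. The sketch uses a standard double-ended queue in place of a hardware linked list, and the class and method names are hypothetical.

```python
from collections import deque


class PendingRequests:
    """Hypothetical per-line queues of requests awaiting a recall/invalidate."""

    def __init__(self):
        self.queues = {}   # memory line -> queue of waiting requesters

    def recall_started(self, line):
        # A recall or invalidate is now outstanding for this line.
        self.queues.setdefault(line, deque())

    def is_pending(self, line):
        return line in self.queues

    def enqueue(self, line, requester):
        # New request for a line that is being recalled: queue it.
        self.queues[line].append(requester)

    def recall_done(self, line):
        # Data or acknowledgment received: release requests in arrival order.
        return list(self.queues.pop(line))
```

Because each memory line has its own queue, requests for different lines do not delay one another, and requests for the same line are released in the order in which they arrived.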
Generally, linked lists are implemented using random access memory (RAM) structures. In the cache coherency mechanism described above, when a new request for a memory line arrives, the memory controller searches the directory for the memory line. If the memory controller discovers that the memory line is unavailable, so that the request cannot be immediately satisfied, the memory controller queues the new request in a linked list for the memory line. This generally includes creating a new entry for the new request, locating the end of the linked list and updating a next-entry pointer in the last entry in the linked list to point to the new entry. The new entry is then designated as the tail (last entry) of the linked list.
When the memory line becomes available, the memory controller will access the linked list for the first (head) entry and take the proper action. The pointers in the linked list will be updated to reflect the removal of the first entry in the linked list.
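A RAM-based linked list of this kind can be sketched as follows. The sketch assumes that the end of the list is located by following next-entry pointers from the head, which is only one possible approach; a stored tail pointer would avoid the walk at the cost of additional state. The class name, the NIL marker and the ram_reads counter are hypothetical.

```python
NIL = -1   # marks "no next entry"


class RamLinkedLists:
    """Hypothetical linked lists stored in arrays that model a RAM."""

    def __init__(self, size):
        self.payload = [None] * size     # request stored in each entry
        self.next = [NIL] * size         # next-entry pointer of each entry
        self.free = list(range(size))    # indices of unused entries
        self.head = {}                   # memory line -> index of head entry
        self.ram_reads = 0               # pointer reads spent finding the tail

    def append(self, line, request):
        new = self.free.pop()            # create a new entry
        self.payload[new] = request
        self.next[new] = NIL             # the new entry becomes the tail
        if line not in self.head:
            self.head[line] = new        # first entry of a new list
            return new
        cur = self.head[line]
        while True:                      # locate the end of the list
            self.ram_reads += 1
            if self.next[cur] == NIL:
                break
            cur = self.next[cur]
        self.next[cur] = new             # old tail now points to new entry
        return new

    def pop_head(self, line):
        first = self.head[line]
        request = self.payload[first]
        nxt = self.next[first]
        if nxt == NIL:
            del self.head[line]          # the list is now empty
        else:
            self.head[line] = nxt        # second entry becomes the head
        self.free.append(first)          # entry can be reused
        return request
```

In this sketch the number of pointer reads needed to append an entry grows with the length of the list, which illustrates why the cost of locating entries matters when many requests are queued for the same memory line.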
As is clear from the above description, every request results in the memory controller accessing the directory for the memory line one or more times. A linked-list entry may also be created. If a line recall or invalidate is issued on behalf of the request, then when the memory line is returned or invalidated, the memory controller again needs to search the linked list associated with the directory in order to complete the request. As a result, the search for the next entry must be efficient. Otherwise, significant performance loss can result in many applications.