The present invention concerns computing devices and pertains particularly to the formation of linked lists using content addressable memory.
Linked lists are useful constructs in many applications. One such application is ensuring cache coherency in a multiprocessor (MP) system. Organizing accesses to the same memory line in a linked list allows these requests to be serviced in the arrival order even in the presence of conflicts. This ensures fairness and prevents starvation problems that could occur when conflicts are resolved using retry or other methods.
A cache memory is a small, high-speed buffer memory that temporarily holds those portions of the contents of main memory that are expected to be used by a processor in the near future. The main purpose of a cache memory is to shorten the time necessary to perform memory accesses, whether for data or for instruction fetch. Information located in cache memory can be accessed in much less time than information located in main memory. Thus, a processor with a cache memory spends far less time waiting for instructions and operands to be fetched and/or stored.
A cache memory is made up of many cache lines, each holding one or more words of data. Each cache line has an associated address tag that uniquely identifies the memory line of main memory of which the cache line is a copy. Each time the processor makes a memory reference, an address tag comparison is made to see if a copy of the requested data resides in the cache memory. If the desired memory line is not in the cache memory, the memory line is retrieved from the main memory, stored in the cache memory as a cache line, and supplied to the processor.
In addition to using a cache memory to retrieve data from main memory, the processor may also write data into the cache memory instead of directly to the main memory. When the processor desires to write data to the memory, the cache memory makes an address tag comparison to see if the cache line into which data is to be written resides in the cache memory. If the cache line exists in the cache memory and is modified (dirty) or exclusive, the data is written into the cache line in the cache memory. In many systems a data "dirty bit" for the cache line is then set. The dirty bit indicates that data in the cache line is dirty (i.e., has been modified), and thus before the cache line is deleted from the cache memory the modified data must be written into main memory. If the cache line into which data is to be written does not exist in the cache memory, the cache/memory line must be fetched as exclusive into the cache memory or the data written directly into the main memory.
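The tag comparison, line fill, and dirty-bit handling described above can be illustrated with a minimal sketch. It assumes a direct-mapped, write-back organization in which the full line address serves as the tag and a write miss fetches the line before writing; the class name WriteBackCache and its methods are hypothetical and are not taken from this description.

```python
class WriteBackCache:
    """Toy direct-mapped, write-back cache (illustrative assumption only)."""

    def __init__(self, num_lines, main_memory):
        self.num_lines = num_lines
        self.memory = main_memory            # dict: line address -> data
        self.tags = [None] * num_lines       # address tag per cache line
        self.data = [None] * num_lines
        self.dirty = [False] * num_lines     # dirty bit per cache line

    def _slot(self, line_addr):
        # Direct mapping: each memory line can live in exactly one slot.
        return line_addr % self.num_lines

    def _fill(self, line_addr):
        slot = self._slot(line_addr)
        # A dirty victim must be written back before it is replaced.
        if self.tags[slot] is not None and self.dirty[slot]:
            self.memory[self.tags[slot]] = self.data[slot]
        self.tags[slot] = line_addr
        self.data[slot] = self.memory.get(line_addr, 0)
        self.dirty[slot] = False
        return slot

    def read(self, line_addr):
        slot = self._slot(line_addr)
        if self.tags[slot] != line_addr:     # tag comparison: miss
            slot = self._fill(line_addr)
        return self.data[slot]

    def write(self, line_addr, value):
        slot = self._slot(line_addr)
        if self.tags[slot] != line_addr:     # write miss: fetch the line first
            slot = self._fill(line_addr)
        self.data[slot] = value
        self.dirty[slot] = True              # main memory is now stale
```

In this sketch a write leaves main memory unchanged until the dirty line is displaced by a later fill that maps to the same slot.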
A shared-memory MP system has a potentially large number of processors, each with its own cache(s). When an access to memory is made in such a system, it is necessary to take steps to ensure the integrity of the data accessed. For example, when an entity reads data from memory, it is important to determine whether an updated version of the data resides in the cache of a processor on the system. If an updated version of the data exists, something must be done to ensure that the entity accesses the updated version of the data. A mechanism that assures that the updated version of the data is utilized in a memory reference is referred to herein as a coherence mechanism.
The most common coherency mechanism is snooping, which usually requires the processors to share a bus. However, for electrical reasons, only a limited number of processors can share a bus. Therefore, when the number of processors in an MP system is large, snooping can no longer be efficiently used for cache coherency.
The most common cache coherency mechanism for systems with a large number of processors is a directory structure in memory. Within the directory structure, line state information exists for each memory line within the memory. The line state information consists of a number of bits for each memory line. The bits for each memory line indicate, for that memory line, the state of the memory line (Private, Shared, etc.), plus extra information relevant for that memory line state. When the memory line is held "Private" in a cache of a first processor, this means that the memory line is not available for use by other processors until released by the first processor, and that first processor is allowed to modify the contents of that memory line. When the memory line is held "Shared" in a cache of a first processor, this means that the memory line is available for use by other processors as long as the other processors do not want to hold the memory line as "Private", and while the line is held "Shared" the contents of the line are not allowed to be modified.
When a processor desires to access a memory line, a request is sent to a memory controller for the memory line. The memory controller reads line state information for the memory line to determine the current state of the requested memory line. If the line state information bits for the requested memory line indicate that the memory line is held Private in another cache, the memory line is recalled to the memory controller. When the memory line comes back to the memory controller, the memory controller supplies the memory line to the requester, updates the memory line's line state information and, if necessary, updates the data for the memory line in memory. If the memory line is requested as "Private" and the memory controller reads the line state information and finds the memory line is "Shared", the memory controller invalidates copies of the memory line in other caches (as indicated by the line state information) and then supplies the memory line to the requester. The memory controller also tags the memory line's line state information as "Private" and indicates the processor which now possesses the memory line.
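The request handling just described can be sketched as a small state update over the line state information. The sketch below is illustrative only: the class name DirectoryController, the "Idle" state for a line held by no cache, and the returned action tuples are assumptions introduced here, and the recall and invalidate operations are modeled as completing immediately rather than after a delay.

```python
class DirectoryController:
    """Toy directory: tracks who holds each memory line (assumed model)."""

    def __init__(self):
        # line address -> ("Private", owner) or ("Shared", set of sharers)
        self.line_state = {}

    def request(self, line, requester, want_private):
        """Return the actions the controller takes to satisfy one request."""
        actions = []
        kind, holders = self.line_state.get(line, ("Idle", None))

        if kind == "Private" and holders != requester:
            # Line is held Private in another cache: recall it first.
            actions.append(("recall", holders))
            kind, holders = "Idle", None

        if want_private:
            if kind == "Shared":
                # Invalidate every other cached copy before granting Private.
                for sharer in sorted(holders - {requester}):
                    actions.append(("invalidate", sharer))
            self.line_state[line] = ("Private", requester)
        elif kind == "Private":
            pass  # requester already holds the line Private
        else:
            sharers = set(holders) if kind == "Shared" else set()
            sharers.add(requester)
            self.line_state[line] = ("Shared", sharers)

        actions.append(("supply", requester))
        return actions
```

For example, two Shared requests followed by a Private request from a third processor would, under these assumptions, yield two invalidate actions before the line is supplied.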
The memory line recall/invalidate can take significant time to return/invalidate data. Meanwhile, new requests for the same memory line can be received by the memory controller. Retrying these new requests is messy in a large system because of the need to provide fairness and prevent starvation. Instead, the new requests for that memory line can be queued as a linked list for that memory line in the memory controller. Once the recalled data or invalidate acknowledge is received, the memory controller services the requests for that memory line in the linked list in the order the requests were received. Multiple linked lists, one per recalled memory line, can exist at any time in the memory controller.
Generally, linked lists are implemented using random access memory (RAM) structures. In the cache coherency mechanism described above, when a new request for a memory line arrives and the memory controller, after searching the directory, discovers that the memory line is unavailable, the memory controller queues the new request in a linked list for that memory line. This generally includes creating a new entry for the new request, locating the end of the linked list, and updating a next-entry pointer in the last entry in the linked list to point to the new entry. The new entry is then designated as the tail (last entry) of the linked list.
When the memory line becomes available, the memory controller will access the linked list for the first (head) entry and take the proper action. The pointers in the linked list will be updated to reflect the removal of the first entry in the linked list.
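For comparison with the content addressable approach, the conventional RAM-based queueing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name RamLinkedLists, the per-line head table, and the free list are hypothetical, and the tail is located by walking next-entry pointers.

```python
class RamLinkedLists:
    """Toy RAM-based request lists, one list per memory line (assumed model)."""

    def __init__(self, size):
        self.next = [None] * size        # next-entry pointer per RAM entry
        self.payload = [None] * size     # queued request per RAM entry
        self.free = list(range(size))    # unused entry indices
        self.head = {}                   # line address -> index of head entry

    def enqueue(self, line, request):
        idx = self.free.pop(0)           # create a new entry
        self.payload[idx] = request
        self.next[idx] = None
        if line not in self.head:
            self.head[line] = idx        # first entry of a new list
        else:
            cur = self.head[line]
            while self.next[cur] is not None:   # locate the end of the list
                cur = self.next[cur]
            self.next[cur] = idx         # old tail now points to the new entry
        return idx

    def dequeue(self, line):
        idx = self.head[line]            # access the first (head) entry
        nxt = self.next[idx]
        if nxt is None:
            del self.head[line]          # list is now empty
        else:
            self.head[line] = nxt        # pointers updated to remove the head
        request = self.payload[idx]
        self.payload[idx] = None
        self.next[idx] = None
        self.free.append(idx)
        return request
```

Each enqueue in this sketch costs one pointer read per entry already in the list, which is the kind of repeated access the remainder of this description seeks to avoid.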
As is clear from the above description, every request results in the memory controller accessing the directory for the memory line one or more times. A linked-list entry may also be created. If a line recall or invalidate is issued on behalf of the request, then when the memory line is returned/invalidated, the memory controller again needs to search the linked list associated with the directory in order to complete the request. As a result, the search for the next entry must be efficient. Otherwise, significant performance loss can result in many applications.
In accordance with the preferred embodiment of the present invention, a linked list structure in a computing system includes a first entry and additional entries. Each additional entry includes a link reference to a prior entry in the linked list. The link references for the additional entries are all stored within a content addressable memory. For example, the link reference to the prior entry is the index of the prior entry within the content addressable memory. Each additional entry is accessible by performing a content search for the link reference that the entry holds to its prior entry. That is, when the link reference to the prior entry is a link field which contains the index of the prior entry within the content addressable memory, the next additional entry is found by performing an associative search in the content addressable memory for the index of the entry in the linked list immediately prior to the next additional entry.
Thus, the linked list is traversed, for example, by accessing the first entry in the linked list. A second entry in the linked list is accessed by searching the content addressable memory for a reference to the first entry (e.g., using the index of the first entry within the content addressable memory). A third entry in the linked list is accessed by searching the content addressable memory for a reference to the second entry (e.g., using the index of the second entry within the content addressable memory). And so on.
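The traversal described above can be sketched in software as follows. The sketch models the content addressable memory as a list that is scanned for a matching link field, whereas real hardware would compare all entries in parallel; the names LinkCam, allocate, search, and traverse are hypothetical and are introduced only for illustration.

```python
class LinkCam:
    """Toy content addressable memory holding one link field per entry."""

    def __init__(self, size):
        self.valid = [False] * size
        self.link = [None] * size    # index of the prior entry, None for a first entry

    def allocate(self, prior_index):
        """Store a new entry whose link field references prior_index."""
        i = self.valid.index(False)  # first free entry (raises ValueError if full)
        self.valid[i] = True
        self.link[i] = prior_index
        return i

    def search(self, prior_index):
        """Associative search: which valid entry links back to prior_index?"""
        for i in range(len(self.valid)):
            if self.valid[i] and self.link[i] == prior_index:
                return i
        return None


def traverse(cam, first_index):
    """Walk a list from its first entry by repeated content searches."""
    order = [first_index]
    nxt = cam.search(first_index)
    while nxt is not None:
        order.append(nxt)
        nxt = cam.search(nxt)
    return order
```

Because each entry stores a reference to its predecessor rather than to its successor, appending an entry in this sketch requires writing only the new entry; no existing entry is modified.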
In various embodiments of the present invention, the content addressable memory also stores for each entry a validity bit which indicates whether the entry is valid.
Various embodiments of the present invention may be tailored for particular applications. For example, in one embodiment, the linked list is used within a request queue in a memory controller used to access memory lines in a main memory. In a single queue embodiment, there is stored in the content addressable memory for the first entry and for each of the additional entries, a head field, a tail field, an address field, and a validity bit. The head field contains a value which is true for the first entry and which is false for each of the additional entries. The tail field contains a value which is true only for a last entry in the linked list. The address field contains an address for a memory line in the main memory. The validity bit indicates whether an entry is valid (in use). Additional information for each entry is stored in a random access memory addressed using the "match" bits from the corresponding CAM entry or from a normal index decoder. For example, the additional information includes an operation to be performed on the memory line. The additional information additionally may include a data field which, when valid, stores current data for the memory line.
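One way the single queue fields might be used together is sketched below. The field names follow the description above, but the enqueue and dequeue procedures, the class name SingleQueue, and the choice to clear the link field when an entry is promoted to head are assumptions made for illustration; the sequential scan stands in for a parallel CAM match.

```python
class SingleQueue:
    """Toy single-queue request list: CAM fields plus a companion RAM."""

    def __init__(self, size):
        self.cam = [dict(valid=False, head=False, tail=False,
                         address=None, link=None) for _ in range(size)]
        self.ram = [None] * size     # operation to perform, per entry

    def _match(self, **fields):
        """Model a CAM search over valid entries for the given field values."""
        for i, e in enumerate(self.cam):
            if e["valid"] and all(e[k] == v for k, v in fields.items()):
                return i
        return None

    def enqueue(self, address, operation):
        # Assumes a free entry exists; next() raises StopIteration otherwise.
        new = next(i for i, e in enumerate(self.cam) if not e["valid"])
        old_tail = self._match(address=address, tail=True)
        if old_tail is not None:
            self.cam[old_tail]["tail"] = False
        # With no existing list for this address, the new entry is also the head.
        self.cam[new] = dict(valid=True, head=old_tail is None, tail=True,
                             address=address, link=old_tail)
        self.ram[new] = operation
        return new

    def dequeue(self, address):
        head = self._match(address=address, head=True)
        if head is None:
            return None
        nxt = self._match(link=head)     # entry whose link references the head
        if nxt is not None:
            self.cam[nxt]["head"] = True
            self.cam[nxt]["link"] = None # avoid a stale match if the index is reused
        self.cam[head]["valid"] = False
        operation, self.ram[head] = self.ram[head], None
        return operation
```

Under these assumptions, appending a request takes one search for the current tail of the addressed line, and servicing a request takes one search for the head plus one search for its successor, independent of list length.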
In another embodiment of the present invention, the linked list is used within a two-queue implementation of a request queue in a memory controller used to access memory lines in a main memory. In the two-queue embodiment, for example, an address field and a validity bit for the first entry of each linked list are stored in a separate (head queue) content addressable memory. The head queue content addressable memory contains only the head entries of eventual linked lists for memory lines. The address field contains an address for a memory line. The validity bit indicates whether the entry is valid.
Additional information for the first entry is stored in a random access memory. The additional information includes, for example, an operation to be performed on a memory line, a last field and a tail field. The last field contains a value which is true when the first entry is a last entry in the linked list. The tail field, when valid, contains an index for the last entry in the linked list. The additional information additionally may include a data field which, when valid, stores current data for the memory line.
There is stored in a second content addressable memory, for each of the additional entries, a head/link field. The head/link field contains a value which indicates whether the prior entry referenced by the link reference resides in the head queue content addressable memory or in the second content addressable memory. Entries in the second content addressable memory can also contain a valid field and a link field.
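The interplay of the head queue, the last and tail fields, and the head/link field can be sketched as follows. This is an illustrative reading rather than a definitive implementation: the class name TwoQueue, the boolean prior_in_head standing in for the head/link field, and the enqueue procedure are assumptions, and only enqueueing and traversal are shown.

```python
class TwoQueue:
    """Toy two-queue request list: head CAM + RAM, and a second (link) CAM."""

    def __init__(self, head_size, link_size):
        self.head_cam = [dict(valid=False, address=None) for _ in range(head_size)]
        self.head_ram = [None] * head_size   # operation, last, tail per head entry
        self.link_cam = [dict(valid=False, prior_in_head=False, link=None)
                         for _ in range(link_size)]
        self.link_ram = [None] * link_size   # operation per additional entry

    def _find_head(self, address):
        for i, e in enumerate(self.head_cam):
            if e["valid"] and e["address"] == address:
                return i
        return None

    def _find_next(self, prior_index, prior_in_head):
        # The head/link flag disambiguates equal indices in the two CAMs.
        for i, e in enumerate(self.link_cam):
            if (e["valid"] and e["prior_in_head"] == prior_in_head
                    and e["link"] == prior_index):
                return i
        return None

    def enqueue(self, address, operation):
        h = self._find_head(address)
        if h is None:
            # No list yet for this line: the request becomes a head entry.
            h = next(i for i, e in enumerate(self.head_cam) if not e["valid"])
            self.head_cam[h] = dict(valid=True, address=address)
            self.head_ram[h] = dict(operation=operation, last=True, tail=None)
            return
        info = self.head_ram[h]
        new = next(i for i, e in enumerate(self.link_cam) if not e["valid"])
        if info["last"]:
            # Prior entry is the head itself, so it resides in the head CAM.
            self.link_cam[new] = dict(valid=True, prior_in_head=True, link=h)
        else:
            # Prior entry is the current tail, which resides in the second CAM.
            self.link_cam[new] = dict(valid=True, prior_in_head=False,
                                      link=info["tail"])
        self.link_ram[new] = operation
        info["last"] = False
        info["tail"] = new

    def operations(self, address):
        """Return queued operations for a line in arrival order."""
        h = self._find_head(address)
        if h is None:
            return []
        ops = [self.head_ram[h]["operation"]]
        nxt = self._find_next(h, True)
        while nxt is not None:
            ops.append(self.link_ram[nxt])
            nxt = self._find_next(nxt, False)
        return ops
```

In this sketch the tail field kept with the head entry lets a new request be appended without any search of the second content addressable memory, and the head/link flag keeps an index in the head queue from being confused with the same index in the second content addressable memory.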
The present invention allows for effective ways to implement a linked list which can significantly simplify access to the linked list for certain applications.