This invention relates in general to memory accesses in multi-node, multi-processor, cache coherent non-uniform memory access systems, and relates in particular to managing multiple requests in such systems.
A coherency flow in a Scalable Coherent Interface (SCI) based system requires multiple memory accesses. Each access takes many cycles, so the entire flow takes a great deal of time. When an SCI based system is designed with only one outstanding request, its bandwidth is determined by the latency of each flow. Even though the wires in this type of system are rated at gigabytes per second, the actual useful bandwidth for each node is limited to roughly 30 to 40 megabytes per second. The reason is that the existing system has enough resources in the SCI controller to handle only one request or response at a time.
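The relationship between flow latency and useful bandwidth described above can be illustrated with a short calculation. This is only a sketch: the function name, the 64-byte cache line, and the 2 microsecond flow latency are assumptions chosen for illustration and do not come from the specification, and the model assumes that overlapping flows do not slow one another down.

```python
def useful_bandwidth(line_bytes, flow_latency_s, outstanding=1):
    """Bytes per second delivered when `outstanding` flows overlap fully.

    With a single outstanding request, one cache line is transferred per
    flow latency, so bandwidth is simply line size divided by latency.
    """
    return outstanding * line_bytes / flow_latency_s


# Hypothetical figures, not taken from the specification:
# a 64-byte cache line and a 2 microsecond coherency flow.
single = useful_bandwidth(64, 2e-6)
overlapped = useful_bandwidth(64, 2e-6, outstanding=8)

print(f"one outstanding request:    {single / 1e6:.0f} MB/s")
print(f"eight outstanding requests: {overlapped / 1e6:.0f} MB/s")
```

Under these assumed figures a single outstanding request yields a bandwidth in the tens of megabytes per second, far below the rating of the wires, and the figure scales with the number of flows that can be active at once.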
Therefore, there is a need in the art for a method and system that will use more of the available bandwidth of the system by allowing the system to have more than one outstanding request.
This need and others are met by a system in which, in one embodiment, local storage for the cache line and tag, together with a Content Addressable Memory (CAM) for the cache line address, is used in the SCI controller to allow numerous outstanding requests or flows to be active at one time. All responses from the SCI ring that generate new SCI requests are handled in the controller without requiring additional memory accesses from the local memory. All conflicts with other SCI cache requests and outstanding flows are also handled by the controller.
One technical advantage of the present invention is to use a request activation queue to store a request until there are resources available on the SCI ring to handle the request.
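The behavior of a request activation queue can be sketched in software as follows. This is a minimal model under stated assumptions, not the disclosed hardware: the class name, the method names, and the idea of counting ring resources as a number of free slots are all hypothetical.

```python
from collections import deque


class RequestActivationQueue:
    """Holds requests until a (hypothetical) ring resource slot is free."""

    def __init__(self, ring_slots):
        self.free_slots = ring_slots  # assumed count of available ring resources
        self.pending = deque()        # requests waiting for a resource
        self.active = []              # requests currently on the ring

    def submit(self, request):
        """Queue a request and activate it at once if a slot is free."""
        self.pending.append(request)
        self._drain()

    def complete(self, request):
        """Retire an active request, freeing its slot for a waiting one."""
        self.active.remove(request)
        self.free_slots += 1
        self._drain()

    def _drain(self):
        # Activate waiting requests in arrival order while slots remain.
        while self.pending and self.free_slots > 0:
            self.free_slots -= 1
            self.active.append(self.pending.popleft())


queue = RequestActivationQueue(ring_slots=2)
for name in ("read A", "read B", "read C"):
    queue.submit(name)
# "read C" waits in the queue until one of the first two completes.
queue.complete("read A")
```

The queue decouples the arrival of requests from the availability of ring resources, so a request that cannot be issued immediately is stored rather than blocked at its source.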
Another technical advantage of the present invention is to use a response activation queue to hold a pointer to a CAM location and a table location, so that when the MAC has the required resources to handle the response, the response packet will be formed from the information in the response activation queue.
A further technical advantage of the present invention is to use an SCI table to store information identifying which memory locations already have outstanding access requests.
A further technical advantage of the present invention is to use a content addressable memory with match ports to check if a local or ring request is to access a memory location that already has an outstanding request or response.
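The conflict check performed by the content addressable memory can be modeled in software as a lookup by cache line address. The sketch below is illustrative only: the class and method names are hypothetical, the linear scan stands in for the parallel comparison that match ports would perform in hardware, and the handling of a conflict (raising an error) is an assumption rather than the disclosed behavior.

```python
class LineAddressCam:
    """Software model of a CAM keyed by cache line address."""

    def __init__(self, entries):
        # None marks a free entry; otherwise the stored line address.
        self.slots = [None] * entries

    def match(self, line_address):
        """Return the index of the entry holding this address, or None."""
        for index, stored in enumerate(self.slots):
            if stored is not None and stored == line_address:
                return index
        return None

    def allocate(self, line_address):
        """Record a new outstanding flow; refuse if the line is already in use."""
        if self.match(line_address) is not None:
            raise ValueError("line already has an outstanding request")
        for index, stored in enumerate(self.slots):
            if stored is None:
                self.slots[index] = line_address
                return index
        raise RuntimeError("no free CAM entry")

    def release(self, index):
        """Free an entry once its flow has finished."""
        self.slots[index] = None


cam = LineAddressCam(entries=4)
entry = cam.allocate(0x1000)
# A later local or ring request to 0x1000 now matches the existing entry,
# while a request to a different line does not.
hit = cam.match(0x1000)
miss = cam.match(0x2000)
```

The index returned by a match could also serve as the location of the corresponding table entry, which is one way the table and the CAM might be kept in step; that pairing is an assumption of this sketch.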
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and the specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.