This invention relates generally to the field of multiprocessor systems, and more particularly, to a method and system for storing data at input/output (I/O) interfaces for a multiprocessor system.
Multiprocessor computers often include a large number of computer processors that may operate in parallel. Parallel processing computer architectures include cache-coherent multiprocessors with non-uniform memory access (NUMA) architecture. NUMA architecture refers to a multiprocessor system in which each processor has its own local memory that can also be accessed by the other processors in the system. NUMA architecture is non-uniform in that memory access times are faster for a processor accessing its own local memory than for a processor accessing memory local to another processor.
In order to maintain cache coherence and protect memory pages from unauthorized access, a protection scheme is generally used to enable or disable shared access to a memory page. A memory page may include data, as well as a directory for tracking states associated with cache lines for the memory page. Conventional memory protection schemes utilize memory protection codes to indicate whether a particular element may access the memory page.
For non-shared access to a cache line, the memory protection code simply has to track the single element with access to the cache line. However, for shared access to a cache line, the memory protection code has to track all the elements with access to the cache line in order to notify those elements when their copies of the cache line have been invalidated. Thus, a memory protection code of a specific size can track only a fixed number of elements, which limits the number of elements that may share access to a cache line.
Conventional systems have attempted to solve this problem by using aliased elements. This approach has the memory protection code tracking a number of elements together such that when one element has shared access to a cache line, the memory protection code indicates that multiple elements have shared copies of the cache line. However, as the number of aliased elements increases, the efficiency of the system is reduced in that a greater number of elements that are not actually storing a copy of the cache line must be notified of modifications to the cache line.
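By way of illustration only, the cost of aliasing described above can be sketched as follows. The class name, the modulo mapping of elements to bits, and the sizes used are assumptions made for this example; the conventional systems referred to above may group elements differently.

```python
from dataclasses import dataclass


@dataclass
class AliasedSharingVector:
    """Hypothetical fixed-width sharing vector in which each bit covers several elements."""

    num_bits: int       # width of the memory protection code (assumed)
    num_elements: int   # number of elements in the system (assumed)
    bits: int = 0

    def _bit_for(self, element_id: int) -> int:
        # Assumed aliasing rule: elements are folded onto bits by modulo.
        return element_id % self.num_bits

    def add_sharer(self, element_id: int) -> None:
        # Recording one sharer sets a bit that also stands for its aliases.
        self.bits |= 1 << self._bit_for(element_id)

    def invalidation_targets(self) -> list[int]:
        # Every element whose bit is set must be notified, whether or not
        # it actually holds a copy of the cache line.
        return [
            e for e in range(self.num_elements)
            if (self.bits >> self._bit_for(e)) & 1
        ]


vector = AliasedSharingVector(num_bits=4, num_elements=16)
vector.add_sharer(5)                      # only element 5 holds a copy
targets = vector.invalidation_targets()   # elements 1, 5, 9 and 13 are notified
```

In this sketch a single real sharer causes four notifications. Adding more elements without widening the vector increases the number of unnecessary notifications per set bit.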
Efficiency is further reduced by data caching at input/output (I/O) elements of the system. Because such data is inherently unreliable, validity messages must be transmitted back and forth between the memory storing the data and the I/O element caching a copy of the data. Transmitting these messages consumes available bandwidth. Attempting to solve this problem by tracking I/O elements, in addition to processors, with the memory protection code increases the problem of aliasing caused by the limited size of a memory protection code.
The present invention provides a method and system for storing data at input/output interfaces for a multiprocessor system that substantially eliminate or reduce problems and disadvantages associated with previous systems and methods. In particular, copies of system data are stored at the I/O interfaces in an exclusive read-only state to provide I/O caching with minimal memory management resources.
In accordance with a particular embodiment of the present invention, a multiprocessor system and method includes a processing sub-system having a plurality of processors and a processor memory system. A network is operable to couple the processing sub-system to an input/output (I/O) sub-system. The I/O sub-system includes a plurality of I/O interfaces each operable to couple a peripheral device to the multiprocessor system. The I/O interfaces each include a local memory operable to store exclusive read-only copies of data from the processor memory system for use by a corresponding peripheral device.
More specifically, in accordance with a particular embodiment of the present invention, the processor memory system includes a directory operable to identify data having an exclusive read-only copy stored in the I/O sub-system. In this and other embodiments, the processor memory system is operable to invalidate an exclusive read-only copy of data in the I/O sub-system in response to a request for the data by a processor.
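By way of illustration only, the directory behavior described above might be modeled as in the following sketch. All class, method, and state names are hypothetical. The sketch assumes a single holder per exclusive read-only copy and does not model data held in a modified state by a processor.

```python
from enum import Enum, auto


class LineState(Enum):
    UNOWNED = auto()
    IO_EXCLUSIVE_READ_ONLY = auto()   # one I/O interface holds a read-only copy
    PROCESSOR_OWNED = auto()


class IOInterface:
    """Hypothetical I/O interface with a local memory for cached copies."""

    def __init__(self, name):
        self.name = name
        self.local_memory = {}        # address -> read-only copy of the data

    def invalidate(self, address):
        # A read-only copy is never modified, so it is dropped without a write-back.
        self.local_memory.pop(address, None)


class Directory:
    """Hypothetical directory of the processor memory system."""

    def __init__(self, memory):
        self.memory = memory          # address -> data
        self.state = {}               # address -> (LineState, holder)

    def io_read(self, io, address):
        state, holder = self.state.get(address, (LineState.UNOWNED, None))
        # Assumption: a copy held by another I/O interface is invalidated first,
        # so that only one holder exists at any time.
        if state is LineState.IO_EXCLUSIVE_READ_ONLY and holder is not io:
            holder.invalidate(address)
        io.local_memory[address] = self.memory[address]
        # One holder is recorded; no sharing vector is required.
        self.state[address] = (LineState.IO_EXCLUSIVE_READ_ONLY, io)
        return io.local_memory[address]

    def processor_request(self, cpu_id, address):
        state, holder = self.state.get(address, (LineState.UNOWNED, None))
        if state is LineState.IO_EXCLUSIVE_READ_ONLY:
            holder.invalidate(address)    # invalidate the I/O copy on a processor request
        self.state[address] = (LineState.PROCESSOR_OWNED, cpu_id)
        return self.memory[address]
```

For example, after `directory.io_read(io0, 0x40)` the directory records `io0` as the sole holder of address `0x40`. A later `directory.processor_request(0, 0x40)` removes the copy from `io0.local_memory` and records processor 0 as the owner.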
Technical advantages of the present invention include providing an improved multiprocessor system. In particular, the multiprocessor system utilizes a distributed shared memory with peer I/O. As a result, peripheral devices can intelligently pre-fetch and store data from the multiprocessor system.
Another technical advantage of the present invention includes providing an improved method and system for storing data at input/output interfaces of a multiprocessor system. In particular, data is stored at the I/O interfaces in an exclusive read-only state to allow I/O caching without use of a sharing vector or the need for write-backs. Accordingly, I/O caching is provided with minimal memory resources.
Other technical advantages of the present invention will be readily apparent to one skilled in the art from the following figures, description, and claims.