1. Field of the Invention
This invention relates to the field of computer systems and, more particularly, to input/output (I/O) operations in multiprocessor systems.
2. Description of the Related Art
Processing associated with an I/O operation in a computer system may logically be divided into two parts. A first part may include preparation and dispatch of a device-level I/O request to a hardware I/O device in response to a read or write request from an application. A second part may include receiving and processing a response from the hardware I/O device and returning a completion indication to the application. The first part may be termed “request processing”, while the second part may be termed “response processing”. Other terms such as “top half” processing or “kernel context” processing may also be used to describe part or all of the first part of the processing in operating systems literature, while terms such as “bottom half” processing or “interrupt context” processing may also be used to describe part or all of the second part of the processing.
Several layers of an operating system may be traversed during both request processing and response processing. Various data structures may be accessed at each layer. For example, an application thread or process may invoke a system call for the read or write request. A file system may translate the read or write system call to a block address within a block device, and may prepare a first data structure (such as a block request header or a “buf” structure in some UNIX™-like operating systems) including a pointer to a buffer for storing the data corresponding to the I/O request. The first data structure may also be used for storing the state of the request, for specifying various parameters for the request (e.g., whether direct I/O to a user buffer is being requested), and for specifying a routine to be invoked when the I/O response is received (which may be termed an “I/O Done” routine). A pointer to the first data structure may then be passed to one or more additional layers of the operating system (such as a volume manager layer), each of which may in turn reference additional data structures including layer-specific information. Eventually, either the file system or some other intermediate layer may invoke a device driver entry point (such as a “strategy” routine in some UNIX™-like operating systems). The device driver entry point may interpret the contents of the first data structure (and/or additional data structures) to prepare a device-level, I/O protocol-specific request descriptor for the I/O request, and may enqueue or issue a device-level request to the I/O hardware.
Once the I/O hardware has performed the device-level I/O operation, the I/O hardware may generate a response such as an interrupt signal. An interrupt handler within the operating system may begin response processing. The layers of the operating system traversed during request processing may then be traversed in reverse order, with the various data structures being referenced (i.e., accessed and/or updated) at each corresponding layer. In some cases (e.g., for some read operations), one or more buffers filled by the I/O device may be copied to other buffers, such as a user-level buffer. Response processing may also include cleaning up certain data structures (e.g., by deallocating the data structures or by returning the data structures to a free pool) that may have been used during request processing.
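By way of illustration only, the request and response paths described above may be sketched in C as follows. The sketch is a simplified assumption and does not reproduce the layout of any particular operating system: the names `io_buf`, `dev_request`, `strategy`, `complete_request` and `mark_done`, the flag values, and the 512-byte block size are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request flags; not taken from any real kernel header. */
enum { IO_READ = 1u << 0, IO_DIRECT = 1u << 1, IO_DONE = 1u << 2 };

/* Assumed device block size in bytes. */
enum { BLOCK_SIZE = 512 };

/* Simplified stand-in for a block request header ("buf"-like structure). */
struct io_buf {
    void     *data;    /* buffer that holds the data for the request */
    size_t    length;  /* transfer size in bytes */
    uint64_t  block;   /* block address computed by the file system */
    unsigned  flags;   /* request parameters and request state */
    void    (*io_done)(struct io_buf *);  /* "I/O Done" routine */
};

/* Hypothetical device-level, protocol-specific request descriptor. */
struct dev_request {
    uint64_t       lba;      /* starting block on the device */
    uint32_t       nblocks;  /* number of blocks to transfer */
    int            is_read;  /* nonzero for a read, zero for a write */
    void          *dma_addr; /* where the device reads or writes data */
    struct io_buf *owner;    /* request header to complete on response */
};

/* Request processing: driver entry point, analogous to a "strategy"
 * routine. It interprets the request header and fills in the
 * device-level descriptor. */
void strategy(struct io_buf *bp, struct dev_request *req)
{
    assert(bp != NULL && req != NULL);
    req->lba      = bp->block;
    req->nblocks  = (uint32_t)((bp->length + BLOCK_SIZE - 1) / BLOCK_SIZE);
    req->is_read  = (bp->flags & IO_READ) != 0;
    req->dma_addr = bp->data;
    req->owner    = bp;
}

/* Example "I/O Done" routine: records completion in the request state. */
void mark_done(struct io_buf *bp)
{
    bp->flags |= IO_DONE;
}

/* Response processing: called from the interrupt path once the device
 * has finished. It references the same structures that request
 * processing prepared and invokes the "I/O Done" routine. */
void complete_request(struct dev_request *req)
{
    assert(req != NULL && req->owner != NULL);
    struct io_buf *bp = req->owner;
    if (bp->io_done != NULL)
        bp->io_done(bp);
}
```

In this sketch, `complete_request` touches both the descriptor and the request header that `strategy` populated, which is the reuse of data structures between the two parts of the processing that the following paragraphs rely on.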
In a uniprocessor system, both request processing and response processing are performed (by necessity) at the same processor. If the uniprocessor system includes a cache hierarchy (e.g., one or more levels of data cache, instruction cache or combined data and instruction cache), the various data structures described above may be brought into the cache hierarchy during request processing. A subset or all of the data structures may remain in the cache hierarchy when response processing occurs, especially if the device-level I/O operation is completed in a relatively short time. Therefore, response processing may benefit from the presence of the data structures in the cache hierarchy (i.e., relatively expensive memory accesses may be avoided during references to the data structures).
In multiprocessor systems, on the other hand, request and response processing may be handled by different processors. For example, a first processor may receive the application's I/O request and perform the request processing, but a second processor may receive the interrupt signal and perform the response processing. In such cases, the data structures that may be referenced during response processing may remain in a cache hierarchy at the first processor, and may not be present in a cache hierarchy at the second processor at the time response processing begins. The second processor may therefore encounter cache misses during response processing, which may require data to be transferred between main memory and the cache hierarchy of the second processor or between cache hierarchies of the two processors. Such cache misses may add memory access latency to response processing and may thereby decrease system efficiency. A mechanism to reduce the likelihood of such cache misses may therefore be desirable.
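The effect described above may be illustrated with a toy model in C. The model is an assumption made purely for illustration: it treats each processor's cache hierarchy as a set of resident data structures, assumes that nothing is evicted between request and response, and ignores coherence traffic. The names `toy_caches`, `touch_all` and `response_misses`, and the counts of processors and data structures, are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { NUM_CPUS = 2, NUM_STRUCTS = 4 };

/* Toy per-processor caches: cached[cpu][s] is true when data
 * structure s is resident in the cache hierarchy of that processor. */
struct toy_caches {
    bool cached[NUM_CPUS][NUM_STRUCTS];
};

/* Reference every data structure from the given processor. Each
 * structure that is not yet resident counts as one miss and is then
 * brought into that processor's cache. Returns the number of misses. */
int touch_all(struct toy_caches *c, int cpu)
{
    assert(c != NULL && cpu >= 0 && cpu < NUM_CPUS);
    int misses = 0;
    for (int s = 0; s < NUM_STRUCTS; s++) {
        if (!c->cached[cpu][s]) {
            misses++;
            c->cached[cpu][s] = true;
        }
    }
    return misses;
}

/* Count the misses seen by response processing when request
 * processing ran on request_cpu and the response is handled on
 * response_cpu, starting from empty caches. */
int response_misses(int request_cpu, int response_cpu)
{
    struct toy_caches c;
    memset(&c, 0, sizeof c);               /* all caches start empty */
    (void)touch_all(&c, request_cpu);      /* request processing */
    return touch_all(&c, response_cpu);    /* response processing */
}
```

Under these assumptions, `response_misses(0, 0)` returns 0 because the response finds every structure already resident, whereas `response_misses(0, 1)` returns `NUM_STRUCTS` because the second processor must fetch each structure.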