1. Technical Field
This application relates generally to managing I/O operations in multi-core computer systems.
2. Description of Related Art
Digital computer systems typically include processing subsystems and memory subsystems, the latter for storing data and sequences of instructions, the former for operating upon the data as directed by a sequence of instructions (such a sequence being known as a “program”).
Advances in hardware design have resulted in “multiprocessor” systems and “distributed” systems, wherein a plurality of intelligent entities (CPUs, I/O channels, etc.) may operate simultaneously, often sharing some of the data in memory and cooperatively updating that data. Similarly, advances in software design have culminated in “multiprogram” or “multiprocess” systems, wherein a single intelligent hardware unit may host a plurality of programs, operating independently of each other, all of which may operate simultaneously, possibly sharing and cooperatively updating data.
One of the problems that had to be overcome to realize such configurations was the coordination of data sharing: for example, preventing two or more processes from attempting to update the same data at the same time. A typical scenario in which difficulty might arise is as follows: (1) a first process reads a location and calculates, based on what it read, new contents that it intends to store in that location; (2) before the first process can write those results, a second process reads the same location and calculates new contents that it also intends to store there; (3) the first process stores the new contents it calculated; (4) the second process stores the new contents it calculated.
The result stored by the second process is probably incorrect, because it is based on obsolete data—data which the second process had no way of knowing was already being updated by the first process.
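By way of illustration only, the four-step scenario described above may be replayed deterministically in a few lines of Python. The dictionary `memory`, the helpers `read` and `write`, and the "add one" calculation are hypothetical stand-ins chosen for this sketch and do not appear in the description above.

```python
# Hypothetical shared memory holding a single location (assumption for illustration).
memory = {"loc": 10}

def read(loc):
    """Return the current contents of a location."""
    return memory[loc]

def write(loc, value):
    """Store new contents in a location."""
    memory[loc] = value

# Step 1: the first process reads the location and calculates new contents.
first_new = read("loc") + 1

# Step 2: before the first process writes, the second process reads the
# same (still unmodified) location and calculates its own new contents.
second_new = read("loc") + 1

# Step 3: the first process stores the contents it calculated.
write("loc", first_new)

# Step 4: the second process stores the contents it calculated,
# overwriting the first process's update with a value based on stale data.
write("loc", second_new)

lost_update_result = memory["loc"]

# Had the two updates run one after the other, both increments would count.
expected_if_serialized = 10 + 1 + 1
```

In this sketch the location ends up holding 11 rather than 12, because the second process calculated its result from the value that existed before the first process's store.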
A solution to this problem has been to define an “atomic” or “indivisible” operation for performing such data modification, in which no intermediate results of the atomic operation are externally visible—when one process is performing such an operation, no other processes can access the data until the operation is completed.
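One possible way to picture such an indivisible operation in software is a read-modify-write that is performed entirely while holding a mutual-exclusion lock. The following is a minimal sketch under that assumption; the class `AtomicCell` and the function `run_atomic_demo` are hypothetical names introduced here, and a lock from Python's standard `threading` module merely stands in for whatever mechanism a given system actually provides.

```python
import threading

class AtomicCell:
    """Hypothetical cell whose read-modify-write is indivisible to other threads."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def update(self, fn):
        # The read, the calculation, and the store all occur while the lock
        # is held, so no other thread can observe an intermediate result.
        with self._lock:
            self._value = fn(self._value)
            return self._value

    def load(self):
        with self._lock:
            return self._value

def run_atomic_demo(workers=4, increments=1000):
    """Have several threads increment one cell and return the final contents."""
    cell = AtomicCell(0)

    def work():
        for _ in range(increments):
            cell.update(lambda v: v + 1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load()
```

Because each update completes before another can begin, the final contents equal the total number of increments requested, regardless of how the threads are scheduled.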
The prior-art implementation of this solution has been to “lock” the entire memory for the duration of an atomic operation, meaning that any request by another process to use the memory had to be held pending until the atomic operation was completed. This has a deleterious effect on the efficiency of the system. A refinement to this basic solution is to lock less than the entire memory, usually the physical “block” or “page” containing the location upon which the atomic operation is being performed. Since such a portion typically comprises several thousand locations or more, this approach still holds pending a significant portion of the other processes contending for memory access, and is thus not a complete solution to the problem.
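The effect of lock granularity described above may be sketched with a simple model that lists which pending requests would be held while an atomic operation is in progress. The block size of 4096 locations, the function names `block_of` and `blocked_requests`, and the sample addresses are all assumptions made for this illustration.

```python
BLOCK_SIZE = 4096  # assumed number of locations per block ("several thousand")

def block_of(address, block_size=BLOCK_SIZE):
    """Return the index of the block containing the given location."""
    return address // block_size

def blocked_requests(locked_address, pending_addresses, granularity):
    """List the pending requests that must wait while locked_address is updated.

    granularity "memory" models locking the entire memory;
    granularity "block" models locking only the containing block.
    """
    if granularity == "memory":
        # Every request waits, whatever location it targets.
        return list(pending_addresses)
    if granularity == "block":
        target = block_of(locked_address)
        return [a for a in pending_addresses if block_of(a) == target]
    raise ValueError("granularity must be 'memory' or 'block'")

# An atomic operation is in progress on location 10 while five other
# requests, to unrelated locations, are pending.
pending = [5, 100, 4095, 4096, 9000]
held_by_memory_lock = blocked_requests(10, pending, "memory")
held_by_block_lock = blocked_requests(10, pending, "block")
```

In this model the whole-memory lock holds all five requests, while the block lock still holds the three that fall in the same block as location 10, even though none of them touches location 10 itself.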