A computer system stores data in its memory and, to do useful work, operates on and manipulates that data. Ideally, a computer system would have a single, indefinitely large, and very fast memory in which any particular piece of data would be immediately available. In practice this has not been possible, because memory that is very fast is also very expensive.
Thus, computers typically have a hierarchy (or levels) of memory, each level of which has greater capacity than the preceding level but is also slower and less expensive per unit. The levels of the hierarchy may be inclusive, that is, all data in one level may also be found in the level below it, and all data in that lower level may be found in the one below it, and so on until the bottom of the hierarchy is reached. To minimize the performance penalty that the hierarchical memory structure introduces, it is desirable to store the most-frequently-used data in the fastest memory and the least-frequently-used data in the slowest memory.
For example, a computer system might contain:
1) a very small, very fast, and very expensive cache that contains the most-frequently-used data;
2) a small, fast, and moderately expensive RAM (Random Access Memory) that contains all the data in the cache plus the next most-frequently-used data; and
3) several large, slow, inexpensive disk drives that contain all the data in the computer system.
When the computer system needs a piece of data, it looks first in the cache. If the data is not in the cache, the computer system retrieves the data from a lower level of memory, such as RAM or a disk drive, and places the data in the cache. If the cache is already full of data, the computer system must determine which data to remove from the cache in order to make room for the data currently needed. For efficiency, data may be moved between levels of storage in units called pages. The process of moving data between levels is called paging.
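The lookup process described above can be sketched as follows. This is a minimal illustration, not an implementation from the text: the page names, the two-level hierarchy, and the cache capacity are all assumptions chosen for the example, and the victim-selection step is a placeholder for the replacement algorithm discussed next.

```python
# Illustrative two-level hierarchy: a small cache over a larger backing
# store, with pages promoted into the cache on a miss. All names and
# sizes here are hypothetical.

CACHE_CAPACITY = 2

backing_store = {"page_a": "data_a", "page_b": "data_b", "page_c": "data_c"}
cache = {}

def lookup(page):
    """Return the page's data, promoting it into the cache on a miss."""
    if page in cache:                 # cache hit: data is immediately available
        return cache[page]
    data = backing_store[page]        # miss: retrieve from the lower level
    if len(cache) >= CACHE_CAPACITY:  # cache full: a resident page must be removed
        victim = next(iter(cache))    # placeholder victim-selection policy
        del cache[victim]
    cache[page] = data                # place the data in the cache
    return data
```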
The algorithm used to select which page is moved back through the levels is called the replacement algorithm. Often, a “least-recently-used” (LRU) algorithm governs the movement of pages: pages that have not recently been referenced are replaced first. Thus, if a page is not used for an extended period of time, it migrates through the storage hierarchy to the slowest level. Hence, the most-recently-used data is kept in high-speed main storage, ready for immediate access, while less-frequently-used data migrates through the storage hierarchy toward the slower-speed storage, often called secondary storage.
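A least-recently-used policy can be sketched with an ordered mapping that tracks reference recency. The class and method names below are illustrative; a fixed number of page frames is assumed.

```python
# Sketch of LRU replacement: the page referenced longest ago is the
# first to be evicted when the table is full.
from collections import OrderedDict

class LRUPageTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # least recently referenced first

    def reference(self, page):
        """Record a reference to `page`; return the evicted page, if any."""
        if page in self.pages:
            self.pages.move_to_end(page)  # most recently used moves to the end
            return None
        evicted = None
        if len(self.pages) >= self.capacity:
            # popitem(last=False) removes the least-recently-used entry
            evicted, _ = self.pages.popitem(last=False)
        self.pages[page] = True
        return evicted
```

For example, with a capacity of two, referencing pages a, b, a, then c evicts b, since a was touched more recently than b.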
The least-recently-used algorithm is acceptable for many paging situations. Situations arise, however, in which certain data on certain pages must be available for immediate access in main storage, independent of the usage history of the pages. Access to these pages may be required as a result of a reference made by either the processor or an I/O (Input/Output) device or network. For example, data buffers for certain high-speed I/O devices or networks must be located in main storage.
A technique for ensuring the presence of a page in main storage is called pinning. When a page is pinned, an area in main storage is reserved for the page, and the page is not permitted to migrate to secondary storage. Any attempt to replace a page pinned in this reserved storage is blocked. Pinning memory is expensive because it leaves less memory available for other system operations, which decreases performance through increased paging activity. Thus, it is desirable to pin pages only for the shortest duration possible, so as not to decrease performance, and to reduce the amount of system memory needed to transfer data to/from the I/O device or network.
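Pinning can be sketched as a layer over LRU replacement: pinned pages are skipped when a victim is chosen, so any attempt to replace them is blocked. The class and method names are hypothetical, chosen only to illustrate the mechanism.

```python
# Sketch of pinning layered on LRU replacement: pinned pages cannot be
# selected for eviction and therefore cannot migrate to secondary storage.
from collections import OrderedDict

class PinnablePageTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # least recently referenced first
        self.pinned = set()

    def pin(self, page):
        self.pinned.add(page)       # reserve the page in main storage

    def unpin(self, page):
        self.pinned.discard(page)   # the page may migrate again

    def reference(self, page):
        """Touch `page`; evict the least-recently-used *unpinned* page if full."""
        if page in self.pages:
            self.pages.move_to_end(page)
            return None
        evicted = None
        if len(self.pages) >= self.capacity:
            for candidate in self.pages:          # oldest first
                if candidate not in self.pinned:  # replacement of pinned pages is blocked
                    evicted = candidate
                    break
            if evicted is None:
                raise MemoryError("all resident pages are pinned")
            del self.pages[evicted]
        self.pages[page] = True
        return evicted
```

This also shows the cost of pinning: each pinned page shrinks the pool of evictable frames, and if every resident page were pinned, no new page could be brought in at all.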
Yet, current systems must keep pages pinned for an extended period of time, as can be seen in the following example of a sequence of actions that a typical system might take to perform a data transmission:
a. The system pins memory locations that contain data to be transferred.
b. The system builds and sends a command to an adapter that requests data to be transferred.
c. The adapter fetches the data and places it into a buffer in the adapter.
d. The adapter transmits the data across a network.
e. The adapter waits for and receives an acknowledgement across the network.
f. The adapter sends a response to the system indicating the command completed.
g. The system releases the pinned memory.
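The sequence above can be sketched to make the cost explicit: the pin taken in step a is held across the entire transfer, including the network round trip, and is released only after the adapter's completion response in step g. The adapter is stubbed out here; the function names are assumptions for illustration.

```python
# Sketch of steps a-g above: the pin spans the whole transfer, so the
# pages stay reserved in main storage for the full round trip.

pinned_pages = set()

def pin(pages):
    pinned_pages.update(pages)             # reserve pages in main storage

def unpin(pages):
    pinned_pages.difference_update(pages)  # pages may migrate again

def adapter_transfer(pages):
    """Stub for steps c-f: fetch, transmit, await acknowledgement, respond."""
    # The data must remain in main storage while the adapter works on it.
    assert pinned_pages.issuperset(pages)
    return "complete"

def transmit(pages):
    pin(pages)                              # a. pin memory containing the data
    # b. build and send a command to the adapter (stubbed out here)
    response = adapter_transfer(pages)      # c-f. adapter performs the transfer
    unpin(pages)                            # g. release only after the response
    return response
```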
In order to increase performance and decrease the amount of memory needed to transfer data, what is needed is a way to decrease the time that pages remain pinned. Although the aforementioned performance problems have been described in the context of a system pinning memory locations, they can apply in any scenario where performance is an issue.