Computer systems, in general, include a main memory (also known as the computer's "physical memory") for storing data and instructions of currently executing programs ("process threads"). Typically, the main memory is organized as a plurality of sequentially numbered storage units, each containing a fixed-size quantity (e.g. an 8-bit byte in byte-oriented computers). The numbers of the storage units (typically expressed in binary or hexadecimal, starting from zero and running up to the total number of storage units minus one) serve as addresses by which a particular storage unit can be referenced for reading or writing the data contained therein. The set of numbers by which the storage units are addressed is known as the "physical address space" of the main memory. Main memory typically is realized using semiconductor memory, which provides fast random access to the various storage units but requires constant application of electrical energy for operation (i.e. the memory is volatile).
Computer systems also typically provide one or more secondary storage devices, which are generally slower than the main memory but have a much greater storage capacity. The secondary storage devices typically store data on a non-volatile magnetic or optical medium, such as a hard disk. Secondary storage devices generally store data in the form of files or sequential data streams.
Due to the greater speed at which data can be accessed in main memory, data that is currently in use by process threads running on the computer system is desirably stored in the main memory. Due to the smaller storage capacity of the main memory, however, main memory may be unable to store all the information needed by process threads. Accordingly, data that is no longer currently in use is desirably removed from the main memory, or moved from the main memory to the secondary storage devices.
Techniques to efficiently manage the use of the main memory by process threads ("memory management techniques") are conventionally known. One standard technique, commonly known as "virtual memory," is implemented by many operating systems, usually in cooperation with a computer system's processor. Virtual memory techniques create a separate address space, referred to as the "virtual address space," by which process threads access data in memory. The operating system and processor translate, or map, a subset of the virtual addresses in the virtual address space to actual physical addresses in the main memory's physical address space. When a process thread reads or writes data at a virtual address in its virtual address space, the operating system and/or processor translates the virtual address to the corresponding physical address of a storage unit in the main memory where the data is to be read or written. In the first version (version 3.1) of Microsoft Corporation's Windows NT operating system (Windows NT 3.1), for example, a component called the virtual memory manager implements a separate virtual address space for each process thread in cooperation with the computer's processor.
Since the virtual address space is typically much larger than the physical address space of the main memory, only a subset of the virtual address space can be resident in main memory at one time. Data not resident in main memory is temporarily stored in a "backing store" or "paging" file on the computer's hard disk. When the main memory becomes overcommitted (i.e. its storage capacity is exceeded), the operating system begins swapping some of the contents of the main memory to the backing store file. When the data is again required by a process thread, the operating system transfers the data back into the main memory from the backing store file. By swapping data that is not currently needed to the hard disk, virtual memory allows programmers to create and run programs that require more storage capacity than is available in the main memory alone.
Moving data between the main memory and the hard disk is most efficiently performed in larger size blocks (as compared to bytes or words). Accordingly, virtual memory techniques generally perform swapping in large size blocks. Microsoft Corporation's Windows NT 3.1 operating system, for example, divides the virtual address space of each process thread into equal size blocks referred to as "pages." The main memory also is divided into similar size blocks called "page frames," which contain the pages mapped into the main memory. The page size in the Windows NT 3.1 operating system is 4 KB, 8 KB, 16 KB, 32 KB, or 64 KB, depending on the requirements of the particular computer on which it is run.
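The division of a virtual address space into equal-size pages can be illustrated with a short sketch. It is not taken from Windows NT; the function name split_virtual_address and the 4 KB page size are assumptions chosen for illustration. With equal-size pages, a virtual address splits into a page number and a byte offset within that page.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages; the text notes other sizes are possible


def split_virtual_address(vaddr, page_size=PAGE_SIZE):
    """Return (virtual page number, byte offset within that page)."""
    return divmod(vaddr, page_size)


# Hypothetical address chosen for illustration.
page_number, offset = split_virtual_address(0x12345)
print(hex(page_number), hex(offset))  # prints: 0x12 0x345
```

Only the page number needs to be translated to a page frame; the offset is the same within the page and within the page frame that holds it.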
In the Windows NT 3.1 operating system, each process has a set of pages from its virtual address space that are present in physical memory at any given time. Pages that are currently in the main memory and immediately available are termed "valid pages." Pages that are stored on disk (or in memory but not immediately available) are called "invalid pages." When an executing thread accesses a virtual address in a page marked invalid, the processor issues a system trap called a "page fault." The operating system then locates the required page on the hard disk and loads it into a free page frame in the main memory. When the number of available page frames runs low, the virtual memory system selects page frames to free and copies their contents to the hard disk. This activity, known as "paging," is imperceptible to the programmer.
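The paging behavior described above can be sketched as a toy model. This is a simplified illustration, not the Windows NT virtual memory manager: the class name TinyPager is hypothetical, a dictionary stands in for the paging file, and least-recently-used eviction is an assumed policy chosen only to keep the sketch short.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class TinyPager:
    """Toy demand pager: a fixed number of page frames plus a dict 'disk'."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # valid pages: page number -> contents
        self.backing_store = {}        # pages that have been paged out
        self.page_faults = 0

    def access(self, page_number):
        if page_number in self.resident:
            # Valid page: no fault, just record the recent use.
            self.resident.move_to_end(page_number)
            return self.resident[page_number]
        # Invalid page: this access is a page fault.
        self.page_faults += 1
        if len(self.resident) >= self.num_frames:
            # No free frame: evict the least recently used page (assumed policy).
            victim, contents = self.resident.popitem(last=False)
            self.backing_store[victim] = contents
        # Load the page from the backing store, or start a fresh one.
        contents = self.backing_store.pop(page_number, bytearray(PAGE_SIZE))
        self.resident[page_number] = contents
        return contents


pager = TinyPager(num_frames=2)
pager.access(0)[0] = 7   # write to page 0
pager.access(1)
pager.access(2)          # page 0 is paged out to make room
print(pager.access(0)[0], pager.page_faults)  # prints: 7 4
```

The caller of access() never sees whether a page came from a frame or from the backing store, which mirrors the point that paging is imperceptible to the programmer.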
A technique similar to virtual memory, known as mapped file input/output (I/O), can be used to access data in files stored in secondary storage devices. Mapped file I/O refers to the ability to view a file residing on disk as part of a process thread's virtual memory (i.e. normal files other than the paging file are mapped into a portion of the process thread's virtual memory). A process thread using mapped file I/O accesses the file as a large array in its virtual memory instead of buffering data or performing disk I/O. The process thread performs memory accesses using virtual addresses to read the file, and the operating system uses its paging mechanism to load the correct page from the disk file. If the application writes to the portion of its virtual address space which is mapped to the file, the operating system writes the changes back to the file as part of normal paging. Because writing to the main memory is generally much faster than writing to the secondary storage devices, mapped file I/O potentially speeds the execution of applications that perform a lot of file I/O or that access portions of many different files.
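As a present-day illustration of mapped file I/O, the following sketch uses Python's standard mmap module rather than any Windows NT 3.1 interface. The helper names patch_file_via_mapping and demo are hypothetical. The file is modified by an ordinary slice assignment into the mapping, and the operating system is left to write the changed pages back to the file.

```python
import mmap
import os
import tempfile


def patch_file_via_mapping(path, offset, new_bytes):
    """Overwrite bytes in a file by writing to a memory mapping of it."""
    with open(path, "r+b") as f:
        # A length of 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as view:
            # Plain memory access: no explicit seek() or write() on the file.
            view[offset:offset + len(new_bytes)] = new_bytes
            view.flush()  # ask the OS to write the changed pages back now


def demo():
    """Create a temporary file, patch it through a mapping, return its bytes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello world")
        patch_file_via_mapping(path, 0, b"HELLO")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


print(demo())  # prints: b'HELLO world'
```

The replacement must fit within the existing file, since a mapping of this kind does not extend the file.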
The Windows NT 3.1 operating system includes a component, the cache manager, which uses mapped file I/O to administer a memory-based cache. The cache manager places frequently accessed file data in memory in order to provide better response time for I/O bound programs. When a process thread opens and uses a file, the file system notifies the cache manager which maps the file to virtual memory pages using the functionality of the virtual memory manager. As the process thread uses the file, the virtual memory manager brings accessed pages of the file from the hard disk into the main memory. During paging, the virtual memory manager flushes written pages back to the file on the hard disk. This cache can vary in size depending on how much memory is available. The paging functionality of the virtual memory manager automatically expands the size of the cache based on conventional working set techniques when plenty of memory is available, and shrinks the cache when it needs free pages.
The advantage of mapped file I/O is that entire sections of the hard disk containing a mapped file can be materialized into memory at a time, and thereby made available to an I/O system (an operating system component that manages device I/O). Also, process threads of user applications that map files see the same pages of file data that the I/O system is using, so data coherency (i.e., multiple accessors always seeing the current data) between the I/O system and user mapped files is automatic.
A potential problem arises when files are created or extended in the cache. In particular, as new pages of a file are written for the first time, there is no initial data resident on disk for these new pages, so the I/O system logically selects free pages in memory to receive the new file data prior to writing it to disk. Yet, as the I/O system adds these pages to a file, they become immediately visible to applications via mapped file access. This poses a potential security risk if the user applications are allowed to see data that previously existed in these free memory pages. (Free memory pages contain data from the last time the pages were used, which could be data from another file, private system data, system code, etc.) If free pages are made immediately visible in a file prior to being completely written with new data, then the previous data in the page becomes visible to all accessors including user applications via mapped file access.
One way to fix this problem is for the I/O system to ensure that each page is zeroed out before it becomes visible in the file. However, this is very expensive, as all pages added to a file would then have to be written twice: once with zeros and once as they receive new file data.
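The exposure of stale data, and the effect of zeroing, can be shown with a toy model of a free page pool. The names FramePool and reuse_frame are hypothetical, and a bytearray stands in for a page frame; this is a sketch of the idea, not of any actual I/O system.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class FramePool:
    """Toy pool of page frames; released frames keep their old contents."""

    def __init__(self):
        self.free_frames = []

    def release(self, frame):
        self.free_frames.append(frame)  # contents are deliberately not cleared

    def allocate(self, prezero):
        frame = self.free_frames.pop() if self.free_frames else bytearray(PAGE_SIZE)
        if prezero:
            frame[:] = bytes(PAGE_SIZE)  # extra write: a full page of zeros
        return frame


def reuse_frame(prezero):
    """Return what a reader sees in a reused frame that is only partly written."""
    pool = FramePool()
    old = pool.allocate(prezero=False)
    old[:6] = b"secret"              # data belonging to a previous user
    pool.release(old)
    new = pool.allocate(prezero=prezero)
    new[:2] = b"hi"                  # the new file data covers only 2 bytes
    return bytes(new[:6])


print(reuse_frame(prezero=False))  # prints: b'hicret'
print(reuse_frame(prezero=True))   # prints: b'hi\x00\x00\x00\x00'
```

Without zeroing, the tail of the previous contents remains readable; with zeroing, the frame is safe but has been written twice.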
In the first version of the Windows NT operating system, version 3.1, user applications were, in fact, prevented from accessing uninitialized data in the cache by having the I/O system prezero new pages before allowing them to become visible for mapped file access. As mentioned, this hurt performance by causing the pages to be written twice. Zeroing particularly affects I/O performance when writing small files. (This problem therefore is sometimes referred to as the small file writes problem.) When writing an 8-byte file, for example, the Windows NT 3.1 operating system first writes 4 KB of zeros over a mapped page, then the 8 bytes of file data (i.e. a total of 4,104 bytes). For the first write to a new page, that is 513 times as many bytes as the file data alone.
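The arithmetic in this example can be checked with a small calculation. The function name bytes_written_with_prezero is hypothetical, and the model is a simplification: it assumes the file starts on a page boundary and that every new page is zeroed in full before the file data is written.

```python
def bytes_written_with_prezero(file_size, page_size=4096):
    """Bytes written when each new page is zeroed before receiving file data."""
    pages = -(-file_size // page_size)      # ceiling division
    return pages * page_size + file_size    # zeros first, then the data


total = bytes_written_with_prezero(8)
print(total, total // 8)  # prints: 4104 513
```

Under the same assumptions the overhead shrinks as files grow: a file that fills its pages exactly is written about twice, which matches the "written twice" cost noted above.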