Traditionally, computer memories were divided into two categories: read-only memory (ROM) and random-access memory (RAM). ROM is semiconductor-based memory that contains information that can be read by the microprocessor or other hardware devices but not modified. ROM is generally used to store programs for the computer or instructions that the computer needs in order to perform essential tasks. For example, in IBM-compatible computers, the built-in instructions that form the basic input-output system (BIOS) are stored in ROM.
RAM, on the other hand, is memory that can be both read and written by the microprocessor or other hardware devices; it is also generally volatile, i.e., its contents are lost when power is removed. A portion of RAM is used to hold the operating system and the programs and files that the user is actively working with. Other portions of RAM are reserved for the system's own use, including the instructions which access the ROM BIOS chips. The term RAM is often used synonymously with the terms "physical memory" or "primary memory" to refer to the memory actually present in a computer system.
The important functional characteristic of RAM is the ability to randomly access any part of the memory in equal time. That is, the time required to obtain information from one memory location is generally the same as that required to obtain it from any other.
Because the amount of RAM in a computer is limited, "secondary storage" devices such as magnetic tapes and hard disks were developed to store larger amounts of data. In contrast to RAM, the storage area in a secondary storage device is not directly accessible by the processor. Instead, the storage area is accessible only through input/output (I/O) operations, which are much slower than direct accesses to data in RAM. Furthermore, although data may reside in a secondary storage device, it can be processed only when it resides in RAM.
Because the processor can access RAM directly and very quickly, it would be desirable to store all of the computer's data in RAM. Unfortunately, there is a finite amount of space in RAM, and computers often do not have enough RAM to hold everything that is needed. Some older computers have as little as 640K of RAM, or even less, and are therefore unable to hold many large programs and files. Even modern computers that have as much as 256 MB of RAM often do not have the capacity to hold several different application programs at one time in addition to some of the data that is related to those programs.
An obvious solution to this dilemma would be to simply add more and more RAM to the computer. Generally speaking, the more RAM a computer has, the more information and data it can actively work with at one time. RAM, however, is a fairly expensive commodity, and therefore it is not always economically feasible to expand the amount of RAM in a computer. Furthermore, due to hardware constraints, there is always a practical limit to the amount of RAM a computer can use or address.
As software technologies improved, yet another type of storage developed known as "virtual memory". In general, virtual memory provides the illusion that there is a greater amount of RAM than is actually installed in the computer by treating part of a secondary storage device, such as a hard disk, as if it were also RAM. The same part of the computer's operating system that puts programs and data into RAM treats the virtual memory portion of the secondary storage device exactly as if it were RAM.
More specifically, virtual memory extends the amount of memory that the operating system is capable of addressing to take into account the total amount of memory--actual and virtual--available to the system. As is well known to those skilled in the art, the number of addressable words accessible by a computer depends on the number of bits in its address field and is unrelated to the number of memory words actually available. For example, a hypothetical computer having a 16-bit address field can theoretically address 65,536 (64K) words of memory. However, if only 4096 (4K) words of RAM are provided, the addressing capability of the computer is not being fully exploited.
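The relationship described above can be sketched in a few lines of code; the function name and the printed fraction are illustrative only.

```python
# Illustration of the point above: the number of addressable words
# depends only on the width of the address field, not on the amount
# of RAM actually installed.

def address_space_size(address_bits: int) -> int:
    """Number of distinct addresses a given address-field width can form."""
    return 2 ** address_bits

# The hypothetical 16-bit machine can address 65,536 (64K) words...
assert address_space_size(16) == 65536

# ...even if only 4,096 (4K) words of RAM are provided, leaving most
# of the address space without a physical backing.
installed_ram_words = 4096
fraction_backed = installed_ram_words / address_space_size(16)
print(fraction_backed)  # 0.0625 of the address space has a physical backing
```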
As used herein, the term "address space" represents the number of words a particular computer can address utilizing all bits of the address field provided in the computer architecture. In the hypothetical 16-bit computer referenced above, the address space comprises the numbers 0, 1, 2, . . . 65,535, the set of possible addresses.
In a virtual memory system, applications access memory through virtual addresses, which special mapping hardware translates into physical memory (RAM) addresses. For example, in the hypothetical computer referenced above having 4K of physical memory (RAM), a "map" may be created containing information which relates addresses generated in the 64K address space of the computer to addresses in the 4K physical memory (RAM). Procedures for carrying out such mappings have been developed and are well known in the art.
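A minimal sketch of such a map, assuming the 64K address space and 4K of RAM of the hypothetical machine above, with an illustrative page size of 256 words (the table contents are likewise invented for illustration):

```python
# A sketch of virtual-to-physical address translation. The page table
# below maps a virtual page number to a physical page (frame) number in
# RAM, or to None when the page currently resides only on secondary
# storage. All sizes and table entries are illustrative assumptions.

PAGE_SIZE = 256  # words per page (assumed for illustration)
page_table = {0: 3, 1: 0, 2: None}

def translate(virtual_address: int):
    """Translate a virtual address to a physical RAM address, if resident."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        return None          # not in RAM: a page fault would occur here
    return frame * PAGE_SIZE + offset

assert translate(5) == 3 * PAGE_SIZE + 5   # virtual page 0 -> frame 3
assert translate(2 * PAGE_SIZE) is None    # virtual page 2 is paged out
```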
Fundamental to the operation of virtual memory is the concept of "swapping", which is also referred to as "paging". Paging is a technique developed to provide the mapping of a larger address space to a smaller physical memory. Paging divides the virtual address space into fixed-sized blocks called "pages", each of which can be mapped onto any physical addresses available on the system. Within a computer, paging occurs when different pages of programs or data are moved between physical memory (RAM) and a secondary storage device.
In a conventional virtual memory implementation, paging occurs after a "page fault", i.e., when a program has accessed a virtual memory location that is not currently in RAM. A page that has not been recently accessed is "paged out" from RAM to the secondary storage device, and the page needed by the faulting program is "paged in." The mapping hardware is notified of the new physical address of the page, and the instruction that caused the page fault is restarted.
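The fault sequence just described might be sketched as follows; the data structures, page contents, and victim-selection callback are illustrative assumptions, not any particular system's implementation:

```python
# A sketch of the page-fault sequence: evict some resident page, bring
# the needed page in from the paging file, update the map, and let the
# faulting access restart. Which page to evict is a policy decided
# elsewhere, so it is passed in as a callback here.

page_table = {0: 0, 1: 1}                    # virtual page -> RAM frame
paging_file = {2: "page-2 data"}             # pages residing only on disk
ram = {0: "page-0 data", 1: "page-1 data"}   # frame -> contents

def handle_page_fault(needed_page, choose_victim):
    victim = choose_victim(page_table)          # pick a page to evict
    frame = page_table.pop(victim)
    paging_file[victim] = ram[frame]            # "page out" the victim
    ram[frame] = paging_file.pop(needed_page)   # "page in" the needed page
    page_table[needed_page] = frame             # notify the mapping
    return frame                                # faulting access restarts

frame = handle_page_fault(2, choose_victim=lambda table: min(table))
assert page_table[2] == frame
assert 0 in paging_file    # the victim page was written back to disk
```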
In the "WINDOWS" operating system, developed by Microsoft Corporation, the assignee of the present invention, virtual memory is implemented by the creation of a "paging file" on the secondary storage device. The paging file temporarily stores the pages of programs or data when they are not actively in use. Thus, when RAM gets full, the operating system can move pages of programs or data files into the paging file for temporary storage, freeing up space in RAM for new files and programs. In these systems, the term "virtual memory" is often used to refer both to the process by which data is swapped between RAM and the secondary storage device, as well as to the combination of RAM and the paging file.
Those skilled in the art will recognize that, from a program's point of view, it is irrelevant whether a piece of data is stored in the virtual memory portion of RAM or in the virtual memory portion of the secondary storage device, i.e., the paging file. At any particular time, each individual page in the virtual address space might refer to data that is in RAM or in the paging file, depending on how recently it has been used. Because a program can only directly access data that is in RAM, an application will be able to gain immediate access to any piece of data stored in RAM. On the other hand, if a piece of data is stored in the paging file, the operating system must first read the data from the secondary storage device into a section of RAM. If that section of RAM is currently occupied by some other piece of data, then that data will have to be written back to the secondary storage device.
There are many different ways that an operating system may decide which data should be swapped between RAM and the secondary storage device. The most commonly used methods involve "least recently used" algorithms. That is, the operating system keeps track of which applications and data in memory have been least recently accessed and makes them prime candidates for moving to the disk if more RAM is needed for some reason. For example, if the operating system determines that there is not enough RAM to load a particular program or other data, it relocates the least recently used information from RAM into the paging file, and then loads the requested program or data into the newly vacated space in RAM.
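One simple way to realize such "least recently used" bookkeeping is sketched below. Real systems typically approximate LRU with hardware reference bits rather than exact timestamps, so this logical-clock scheme is an illustration only:

```python
# A sketch of "least recently used" tracking: each access stamps the
# page with a logical clock value, and the page with the oldest stamp
# is the prime candidate for moving to the paging file.

last_access = {}   # page -> logical clock value of most recent access
clock = 0

def touch(page):
    """Record an access to the given page."""
    global clock
    clock += 1
    last_access[page] = clock

def lru_candidate():
    """The prime candidate for moving to disk if more RAM is needed."""
    return min(last_access, key=last_access.get)

for page in (10, 11, 12, 10, 12):
    touch(page)
assert lru_candidate() == 11   # page 11 was least recently accessed
```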
In addition to virtual memory, there is a second, competing use for the finite amount of RAM in a computer: the disk cache. A disk cache is generally used to compensate for the slowness of disk drives. Regardless of how fast a disk drive may be, its many mechanical parts make it extremely slow to access compared to a RAM chip, which moves data at the speed of electricity.
The concept behind a disk cache is to speed up the computer's operations by keeping in RAM the data that programs are most likely to request from the disk drive. If a program needs that piece of data again while it is still in RAM, the program can access that information directly from the disk cache without having to read from the disk drive. Thus, the term disk cache refers to the portion of a computer's RAM set aside for temporarily holding information read from or written to a disk drive.
A disk cache operates by intercepting a data request sent from an application or the operating system to the disk drive. The disk cache reads the data from the disk drive, but in addition to the requested data, it may also retrieve more data, typically from adjacent clusters, i.e., units of space on the disk drive. The disk cache passes along the requested data to the application or operating system, but stores a copy of it, along with any excess data also retrieved, in a portion of RAM reserved for the disk cache.
During the time in which the CPU is not actively engaged in processing instructions, the disk cache may take control to read still more data from the disk drive, usually from clusters near the files that have already been read, which the disk cache also stores in RAM. Some disk caches have built-in logic that makes intelligent guesses about which clusters are more likely to be requested later by the application. The intelligence of this logic distinguishes one disk cache's efficiency from another's.
When the application or operating system later requests more data, for example, after a page fault, the disk cache again intercepts the request and checks to see if the requested data is already stored in RAM. If it is, the disk cache supplies the data directly to the application or operating system without having to access the disk drive. Therefore, access time is considerably faster than if the program must wait for the disk drive mechanism to fetch the information from the disk.
If, on the other hand, the data is not already stored in RAM, the disk cache repeats the earlier process, retrieving the new data, supplying it to the application or operating system, and also storing it in RAM along with extra clusters from the disk drive. As the RAM used by the disk cache fills up, the disk cache releases the data that has been in the buffer the longest without being used and replaces it with data retrieved during more recent disk accesses.
When a program issues a command to save data to disk, some disk caches intercept the data and defer writing it to the disk drive until the CPU is otherwise idle. This speeds up computer operations because the CPU's attention is not divided between writing to the disk drive and other processing. If the file to be written to disk is still held in the area of RAM reserved for the disk cache, then the disk cache writes to disk only the clusters that have been changed. Some disk caches also hold pending writes and perform them in an order that minimizes the movements of the disk drive's read/write heads.
Therefore, it will be appreciated that the finite amount of RAM in a computer must be allocated between virtual memory and disk cache. On the one hand, the larger the portion of RAM allocated to the disk cache, the more data from the secondary storage device, or disk drive, that can be kept in RAM rather than on the secondary storage device, and the faster the access to that data will be. On the other hand, the larger the portion of RAM allocated to disk cache, the less RAM available for virtual memory. Thus, the operating system will have to do more paging of data between the virtual memory portion of RAM and the paging file.
At any particular time, there will be a certain allocation of RAM whereby the "performance level" of the computer system is optimized by devoting a certain amount of RAM to the disk cache and a certain amount of RAM to virtual memory. The "performance level" of the computer is inversely related to the number of times that the computer must access the disk in a particular time period, and thus is a measure of the processing speed at which the computer is currently operating. The optimal amount of RAM to devote to either virtual memory or the disk cache is determined by the size of the "working set" of the data, i.e., the data that is currently being frequently accessed. Thus, the particular allocation of physical memory should depend on the particular operation being performed and whether or not the disk is being heavily accessed.
An improper allocation of physical memory can cause a significant slowdown in the operation of the system, i.e., a reduction in the "performance level" of the computer. For example, consider another hypothetical computer system having eight pages of RAM, five of which are set aside for virtual memory and the other three set aside for disk cache. A program that has a "working set" of six pages of virtual memory will only be able to store a maximum of five pages in RAM at any one time. Thus, one additional page will always have to be stored in the paging file. Because the program is constantly accessing all six of these pages, however, the operating system will have to constantly swap pages between RAM and the paging file. It will be appreciated that this is a time-consuming process that significantly slows down the operation of the program. Thus, performance may be significantly improved by reallocating one page from disk cache to virtual memory such that all six pages of the working set will fit into the virtual memory portion of RAM.
On the other hand, consider a computer having three pages of RAM allocated to disk cache and running a program that is searching a database that is four pages large. In this case, only three pages of data will be able to be stored in disk cache at any one time. Therefore, the database will be required to repeatedly access the secondary storage device in order to do repeated searches, causing a significant slowdown in the operation of the system. However, if one additional page of RAM were allocated to disk cache, then the entire database could be stored in the disk cache portion of RAM. This would allow the database to do repeated searches much faster because it would not have to access the secondary storage device.
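The database example above can be made concrete with a small simulation. A least-recently-used replacement policy is assumed for illustration; under it, a repeated sequential scan of four pages through a three-page cache misses on every access, while a four-page cache misses only on the first pass:

```python
# A worked version of the example above: repeatedly scanning a 4-page
# database through an LRU cache of a given size, counting how many
# accesses must go to secondary storage.

from collections import OrderedDict

def disk_accesses(cache_pages, passes=3, db_pages=4):
    cache = OrderedDict()   # page -> present, least recently used first
    misses = 0
    for _ in range(passes):
        for page in range(db_pages):
            if page in cache:
                cache.move_to_end(page)      # hit: mark most recently used
                continue
            misses += 1                      # must access secondary storage
            if len(cache) >= cache_pages:
                cache.popitem(last=False)    # evict least recently used
            cache[page] = True
    return misses

assert disk_accesses(cache_pages=3) == 12   # thrashes: 4 misses per pass
assert disk_accesses(cache_pages=4) == 4    # only the initial pass misses
```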
In most traditional computer architectures, the amount of physical memory set aside for the disk cache is fixed; such systems therefore do not allow the allocation of physical memory to be dynamically changed.
Some prior operating systems do have mechanisms for dynamically moving memory between the disk cache and virtual memory. The "WINDOWS NT" operating system, developed by Microsoft Corporation, dynamically adjusts limits on the size of the disk cache and virtual memory based upon the present memory requirements of computer operations. In particular, these systems link all memory, whether in the disk cache or in virtual memory, into a single pool and track the most recently used and least recently used data sets for computer operations. Least recently used data is always discarded to make room for more recently used data within the fixed total amount of RAM allocated to the disk cache and virtual memory.
For example, if a program is accessing virtual memory more frequently and disk cache less frequently, then the virtual memory will tend to grow because the disk cache pages will be the least recently used. The opposite is also true. That is, if a program is accessing disk cache more frequently than virtual memory, then the disk cache will grow because the virtual memory pages are the least recently used. Thus, the size of the disk cache or virtual memory is adjusted to fit the requirements of the most recently used data.
These prior systems change the allocation of RAM based merely on the level of activity in virtual memory or the disk cache, i.e., based upon all accesses to virtual memory or the disk cache. The prior art systems do not change the allocation of the RAM with respect to the working set, i.e., based upon accesses to the least recently used pages of virtual memory or the most recently discarded pages from disk cache. Consequently, there are certain situations in which these systems will change the allocation of physical memory between virtual memory and disk cache even though doing so does not improve the performance level of the computer system, i.e., it does not reduce the number of real disk accesses.
For example, the working set may be larger than the available RAM set aside for use as the disk cache or virtual memory. Despite this large working set size, these prior art systems will attempt to accommodate it by discarding least recently used data in favor of more recently used data. This leads to a cycle of discarding from RAM data classified as "least recently used" that may still be useful to the operating system or the user's present program. This defeats the purpose of using physical memory to obtain faster data access times and leads to the inefficient use of a scarce memory resource.
Consequently, there is a need for a computer system that dynamically changes the allocation of its physical memory only when doing so will improve the performance of the system.
Furthermore, there is a need for a computer system that detects when changing the allocation of its physical memory will cause a net reduction in the number of real disk accesses, and in response to such a detection, actually changes the allocation of physical memory.