1. Technical Field
The present invention generally relates to improved distributed computing systems and in particular to improved memory management in distributed computing systems. Still more particularly, the present invention relates to a method, system, and program product for improving memory page sharing in a distributed computing environment.
2. Description of the Related Art
Multiprocessor computer systems are well known in the art, and provide for increased processing capability by allowing processing tasks to be divided among several different system processors. In conventional systems, each processor is able to access all of the system resources; i.e., all of the system resources, such as memory and I/O devices, are shared between all of the system processors. Typically, some parts of a system resource may be partitioned between processors, e.g., while each processor will be able to access a shared memory, this memory is divided such that each processor has its own workspace.
More recently, symmetric multiprocessor (SMP) systems have been partitioned to behave as multiple independent computer systems. For example, a single system having eight processors might be configured to treat each of the eight processors (or multiple groups of one or more processors) as a separate system for processing purposes. Each of these “virtual” systems would have its own copy of the operating system, and may then be independently assigned tasks, or may operate together as a processing cluster, which provides for both high-speed processing and improved reliability. Typically, in a multiprocessor system, there is also a “service” processor, which manages the startup and operation of the overall system, including system configuration and data routing on shared buses and devices, to and from specific processors.
Typically, when an SMP system is divided into multiple virtual systems, each of the virtual systems has its own copy of the operating system, and the same operating system is used for each virtual system. Since each processor is running the same operating system, it is relatively easy to provide for resource allocation among the processors.
The name “multiprocessor” is used to connote a parallel computer with a “shared common memory”; the name “multicomputer” is used to connote a parallel computer with “unshared distributed memories”, also described as No Remote Memory Access (NORMA).
Shared memory multiprocessors (often termed “tightly coupled computers”) are further classified into three categories: UMA, NUMA, and COMA. UMA machines feature “Uniform Memory Access”, which implies that the latency for a memory access is uniform for all processors. Alternatively, NUMA machines feature “Non-Uniform Memory Access”, which implies that the latency for a memory access depends on the “location” of the processor relative to the memory being accessed. Notice that a portion of the global shared memory of a NUMA machine may be uniformly accessible (i.e., part of a NUMA may be UMA). There are several memory organizations possible for NUMA machines. The most common is a distributed global memory, in which each processor maintains locally a “piece” of that memory. Access to the “local memory” is quite fast, whereas access to “remote memory” (maintained by some other processor) is much slower (typically two orders of magnitude slower), as it requires navigation through a communication network of some sort. In addition to local memory, a NUMA machine may have a cache memory. If the collective size of the local cache memory of all processors is big enough, it may be possible to dispense with main memory altogether. This results in a COMA (Cache-Only Memory Architecture) machine (a.k.a. an ALLCACHE machine).
UMA/NUMA/COMA multiprocessor machines are further classified as being either symmetric or asymmetric. A symmetric multiprocessor gives all processors “equal access” to the devices (e.g. disks, I/O) in the system; an asymmetric multiprocessor does not. In a symmetric system, executive programs (e.g. OS kernel) may be invoked on any processor.
Non-uniform memory access (NUMA) is a method of configuring a cluster of microprocessors in a multiprocessing system so that they can share memory locally, improving performance and the ability of the system to be expanded. NUMA is used in a symmetric multiprocessing (SMP) system. Ordinarily, a limitation of SMP is that as microprocessors are added, the shared bus or data path gets overloaded and becomes a performance bottleneck. NUMA adds an intermediate level of memory shared among a few microprocessors so that all data accesses don't have to travel on the main bus. To an application program running in an SMP system, all the individual processor memories look like a single memory.
There are two outstanding problems with Non-Uniform Memory Access (NUMA) computers: latency and coherency. Both of these problems are magnified when false sharing occurs.
In a distributed computing environment, including multiprocessor computers, each CPU has its own physical memory and cannot directly see the physical memory of another CPU. The virtual address space, or virtual memory, of the distributed environment is distributed across the physical memory of the CPUs which are participating in the environment. A CPU can claim ownership of an address range (typically the machine page size, such as 4 Kilobytes), which we will call a “page”, and that portion of the virtual address range is sent to that CPU for storage in its physical memory. Thus, only one CPU can view the contents of a particular page of physical memory at any time.
For example, if the requesting CPU needs to access only the first 512 bytes of a 4 Kilobyte page, it must still retrieve and claim ownership of the entire 4 Kilobyte page.
This introduces the problem of “False Sharing”, wherein multiple processors each require access to the same block simultaneously, even if they actually access unrelated parts of that block. In this example, the CPU has claimed 4 Kilobytes of storage when it only needs access to 512 bytes. False sharing leads to reduced cache utilization, increased network traffic, and delays while waiting for data to be retrieved.
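The condition described above can be illustrated with a minimal Python sketch. It is not taken from the described system; the function names and the representation of an access as a (start, length) byte range are assumptions made for illustration. Two accesses are treated as falsely shared when they touch no byte in common yet fall within at least one common 4 Kilobyte page.

```python
PAGE_SIZE = 4096  # 4 Kilobyte page, as in the example above


def pages_touched(start, length, page_size=PAGE_SIZE):
    """Return the set of page indices covered by a byte range (hypothetical helper)."""
    first = start // page_size
    last = (start + length - 1) // page_size
    return set(range(first, last + 1))


def ranges_overlap(a, b):
    """True when two (start, length) byte ranges share at least one byte."""
    a_start, a_len = a
    b_start, b_len = b
    return a_start < b_start + b_len and b_start < a_start + a_len


def is_false_sharing(a, b, page_size=PAGE_SIZE):
    """True when two byte ranges hold unrelated data but land on a common page."""
    common_pages = pages_touched(*a, page_size) & pages_touched(*b, page_size)
    return bool(common_pages) and not ranges_overlap(a, b)


# CPU A uses the first 512 bytes of a page; CPU B uses 512 bytes at offset 2048.
cpu_a = (0, 512)
cpu_b = (2048, 512)
print(is_false_sharing(cpu_a, cpu_b))
```

In this sketch the two CPUs never touch the same byte, yet each must claim the whole page to do its work, which is the situation the text calls false sharing.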
If the page being shared is frequently used, thrashing can occur and performance will suffer. Thrashing is a behavior characterized by the extensive exchange of data between processors competing for the same data block, which occurs so frequently that it becomes the predominant activity. This will considerably slow down all useful processing in the system. It would therefore be desirable to provide a software-based memory management system which reduces thrashing and false sharing.
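As a rough illustration of thrashing, the following Python sketch counts how often ownership of a page must move between CPUs for a given access pattern. The single-owner model follows the description above; the function name, the (cpu, address) access format, and the decision not to count the first claim of a page as a transfer are assumptions of this sketch.

```python
def count_transfers(accesses, unit_size):
    """Count ownership transfers for a sequence of (cpu, address) accesses.

    Each unit of `unit_size` bytes has exactly one owner at a time. An access
    by a CPU that is not the current owner moves the unit to that CPU. The
    first claim of an unowned unit is not counted as a transfer.
    """
    owner = {}      # unit index -> CPU that currently owns it
    transfers = 0
    for cpu, address in accesses:
        unit = address // unit_size
        if unit in owner and owner[unit] != cpu:
            transfers += 1
        owner[unit] = cpu
    return transfers


# CPU "A" and CPU "B" alternate, each touching its own part of one 4 KB page.
accesses = [("A", 0), ("B", 2048)] * 100
print(count_transfers(accesses, 4096))
```

Under these assumptions, 199 of the 200 accesses trigger a transfer of the whole page, even though the two CPUs never use the same bytes; moving the page becomes the predominant activity.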
It is therefore one object of the present invention to provide improved distributed computing systems.
It is another object of the present invention to provide improved memory management in distributed computing systems.
It is yet another object of the present invention to provide a method, system, and program product for improving memory page sharing in a distributed computing environment.
The foregoing objects are achieved as is now described. The preferred embodiment provides a method, system, and computer program product for reducing false sharing in a distributed computing environment, and in particular in a multi-processor data processing system. A method is proposed to define a virtual address range, within the system memory available to the processors, which will have a finer granularity than the hardware page size. These smaller sections, called “sub-pages,” allow more efficient memory management. For example, a 64 Kilobyte range may be defined by the memory management software to have a 512 byte granularity rather than 4 Kilobytes, with each 512-byte sub-page capable of being separately managed.
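The arithmetic of the 64 Kilobyte example can be sketched in Python as follows. This is only an illustration of the granularity change, not the memory management software itself; the constant and function names are hypothetical, and offsets are assumed to be measured from the start of the defined range.

```python
RANGE_SIZE = 64 * 1024   # 64 Kilobyte virtual address range from the example
PAGE_SIZE = 4 * 1024     # 4 Kilobyte hardware page
SUB_PAGE_SIZE = 512      # 512-byte sub-page granularity


def units_in_range(range_size, unit_size):
    """Number of separately managed units in a range (hypothetical helper)."""
    if range_size % unit_size != 0:
        raise ValueError("range size must be a multiple of the unit size")
    return range_size // unit_size


def sub_page_index(offset, sub_page_size=SUB_PAGE_SIZE, range_size=RANGE_SIZE):
    """Index of the sub-page that holds a byte offset within the range."""
    if not 0 <= offset < range_size:
        raise ValueError("offset lies outside the defined range")
    return offset // sub_page_size


print(units_in_range(RANGE_SIZE, PAGE_SIZE))      # 16 hardware pages
print(units_in_range(RANGE_SIZE, SUB_PAGE_SIZE))  # 128 sub-pages
print(sub_page_index(256), sub_page_index(2048))  # 0 4
```

At page granularity the range holds 16 ownership units; at 512-byte granularity it holds 128, eight per hardware page. Offsets 256 and 2048 lie in the same 4 Kilobyte page but in different sub-pages (0 and 4), so under this scheme two CPUs could each own the portion they use without contending for it.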
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.