The present invention relates generally to the field of testing computer system components, and more specifically, to a system and method for testing memory in a computer system while an operating system is active.
Conventional schemes for performing diagnostic tests on computer memory systems are well known. The importance of testing a computer's memory cannot be overemphasized. The costs associated with memory defects are relatively high due to user downtime and loss of information. Many software applications are memory intensive, and it is important to ensure that defective memory which impacts the performance of such software applications is detected and removed.
While most software applications are stored on the hard disk, an application must first be loaded into memory in order to be executed. Initially, the operating system assigns a memory block (e.g., 64 KB) for the application. The operating system then copies the application from the hard disk to the allocated memory block. The application is thereafter executed within the allocated memory. While the application is active, it may require more memory blocks, which it obtains from the operating system by calling a memory allocation routine that is part of the operating system; from time to time, the application releases the memory blocks thus allocated using a memory deallocation routine that is also part of the operating system. Thereafter, upon completion of the application, the operating system reclaims the memory block that it initially allocated for the application, and any memory blocks that the application might have allocated while it was active, for use by other applications. This allocation/deallocation of memory is constantly occurring within the computer system.
Operating systems typically allocate memory at two levels. At a lower level the operating system allocates physical memory, while virtual memory is allocated at a higher level. Further, with today's faster and more efficient processors, operating systems are capable of multitasking and can execute multiple programs simultaneously. This results in increased memory requirements, which typically exceed the available amount of physical memory. Consequently, virtual memory is employed to ensure that adequate memory is available. Virtual memory is a memory allocation scheme in which a computer with a smaller amount of physical memory appears as if it had a much larger amount of memory. Typically, when applications allocate memory, the operating system automatically allocates virtual memory. When an application starts to read or write to virtual memory, the operating system and the CPU detect the read or write operation and automatically verify whether physical memory has been allocated to the virtual memory address that the application is accessing; if not, the operating system allocates physical memory to the virtual memory address that is being accessed. Although an application may be allocated a large amount of virtual memory with a large number of virtual memory addresses, those addresses are mapped to physical memory only as needed. Physical memory is allocated in units known as pages. The size of a physical memory page can vary with the capabilities of the CPU. In this manner, the memory allocation/deallocation routine of the operating system ensures that large memory requirements for software applications are met.
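By way of illustration only, the on-demand mapping of virtual addresses to physical pages described above may be sketched as follows. The class name `DemandPagedSpace`, the 4096-byte page size, and the first-free frame selection are assumptions made for this sketch and do not describe any particular operating system or CPU.

```python
PAGE_SIZE = 4096  # assumed page size; actual sizes depend on the CPU


class DemandPagedSpace:
    """Toy model: virtual pages receive a physical frame only on first access."""

    def __init__(self, virtual_pages, physical_frames):
        self.virtual_pages = virtual_pages
        self.free_frames = list(range(physical_frames))
        self.page_table = {}  # virtual page number -> physical frame number

    def access(self, virtual_address):
        """Translate a virtual address, mapping its page on demand."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if not 0 <= page < self.virtual_pages:
            raise IndexError("address outside the virtual address space")
        if page not in self.page_table:
            # No physical memory is mapped yet: allocate a frame now.
            if not self.free_frames:
                raise MemoryError("no physical frames left")
            self.page_table[page] = self.free_frames.pop(0)
        return self.page_table[page] * PAGE_SIZE + offset
```

In this sketch, a space created with 1024 virtual pages and only 4 physical frames consumes no frames until `access` is first called, which mirrors the statement that addresses are mapped to physical memory only as needed.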
As noted, conventional schemes for performing diagnostic tests on memory in a computer system are well known. One example of such a scheme is a POST (power on self test) computer program embedded in system memory. Because POST executes every time a computer is booted, there is a desire to minimize the time that a user has to spend waiting for the computer to boot. Therefore, POST runs only quick diagnostic tests on the computer memory as well as other system components. Rather than limit testing to system memory, POST typically tests all of the components within the computer system. Moreover, the sophistication of POST is limited in order to reduce the booting period when the computer system is turned on. In any event, POST is limited to the booting process, and failures occurring while the operating system is active are not detected.
Other schemes include various diagnostic programs typically stored on media such as the computer's hard disk drive or a floppy disk. Such diagnostic programs are commercially available for purchase by users, and are employed to detect faults related to computer components, such as memory, video, optical storage, hard disk drive, serial ports and virtual memory. In some instances, the user can select the components on which diagnostic tests should be performed. Typically, diagnostic programs test memory by writing specific data patterns to memory and then reading back these patterns for verification. That is, a deviation from the expected data pattern indicates that the portion of memory is defective.
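By way of illustration only, the write-then-read-back technique described above may be sketched as follows. The particular patterns, the function name `find_defective_offsets`, and the `StuckBitMemory` class (which simulates a cell with bits stuck at 1 so that the sketch has a fault to find) are assumptions of this sketch and are not taken from any particular diagnostic program.

```python
class StuckBitMemory:
    """Simulated memory region; selected offsets have bits stuck at 1."""

    def __init__(self, size, stuck_at=None):
        self._cells = bytearray(size)
        self._stuck = dict(stuck_at or {})  # offset -> mask of stuck-high bits

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, offset):
        return self._cells[offset] | self._stuck.get(offset, 0)

    def __setitem__(self, offset, value):
        self._cells[offset] = value


def find_defective_offsets(memory, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, and report mismatches."""
    defective = set()
    for pattern in patterns:
        for offset in range(len(memory)):
            memory[offset] = pattern
        for offset in range(len(memory)):
            if memory[offset] != pattern:
                defective.add(offset)
    return sorted(defective)
```

In this sketch, `find_defective_offsets(StuckBitMemory(16, {3: 0x01}))` returns `[3]`, because the all-zero pattern cannot be read back from the faulty cell. Note that such a test overwrites the region under test, which is why it cannot be applied to memory that is in use.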
Disadvantageously, if processes are running which occupy portions of system memory, many diagnostic programs cannot test the occupied portions. Attempts to access these portions of memory will result in a system crash. Further, diagnostic programs are typically complicated and cannot be run by a computer novice.
Another disadvantage relates to the length and complexity of diagnostic programs which have significant processor performance and memory requirements. In addition, many diagnostic programs will have a significant impact on system performance, and thus it is not advantageous to run these diagnostic programs in anticipation of failures, but rather only after it appears that a failure has occurred and only the exact cause of the failure remains to be determined.
Further, the accuracy of the results of some diagnostic programs is somewhat doubtful because the diagnostic programs, when executed, do not simulate the full range of operating environments in which the computers are employed. A further disadvantage of such diagnostic programs is that memory failures are not automatically detected. In fact, by the time a system failure occurs, data loss has already taken place and it is almost always too late to manually execute a diagnostic program to prevent loss of valuable information.
Moreover, computers traditionally rely on hardware components such as parity error checking and Error Checking and Correcting (ECC) mechanisms to monitor system memory for errors while the system is running. These solutions increase system cost, as they require extra hardware components, and extra memory to store error detection information.
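By way of illustration only, the extra storage that parity checking requires may be sketched in software as follows. Actual parity and ECC mechanisms are implemented in hardware; the even-parity convention and the function names used here are assumptions of this sketch.

```python
def parity_bit(byte):
    """Return the even-parity bit for one byte (1 if the count of 1 bits is odd)."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte):
    """Model a stored byte together with the extra parity bit kept alongside it."""
    return (byte, parity_bit(byte))


def check(stored):
    """Return True if the stored byte still agrees with its parity bit."""
    byte, parity = stored
    return parity_bit(byte) == parity
```

In this sketch, one additional bit is stored for every byte, and flipping a single bit of the stored byte causes `check` to fail. Flipping two bits goes undetected, which is one reason more elaborate (and more costly) ECC codes are used.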
Therefore, it would be desirable to provide a system and method capable of resolving the aforementioned problems relating to the conventional approaches for performing diagnostic tests on system memory.
The present invention generally integrates memory testing at the operating system level to make memory testing a continuous background task. In a first embodiment, memory is tested when it is deallocated by placing a test pattern in the memory. The accuracy of the test pattern is verified when the memory is required by another software application. In an alternate embodiment, the present invention discloses a method using a test pattern for testing a memory page of a computer system while an operating system is active.
The method comprises the steps of determining a function for allocating the memory page used by the operating system, and hooking the function for allocating the memory page. Upon receiving a request to allocate the memory page, the method involves determining whether the memory page has the test pattern; and if the memory page has the test pattern, verifying the test pattern is correct to ensure the memory page is not defective. If the test pattern is correct, the memory page is allocated as requested, and if the test pattern is incorrect, the memory page is removed from service.
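By way of illustration only, the allocation-side steps recited above may be sketched as follows. The `PagePool` class is a hypothetical stand-in for the operating system's allocator, the 0xA5 test pattern is arbitrary, and the `stamped` set is one assumed way of determining whether a page has the test pattern. Retrying the request with the next free page after a defective page is retired is likewise an assumption of this sketch.

```python
PAGE_SIZE = 4096                            # assumed page size
TEST_PATTERN = bytes([0xA5]) * PAGE_SIZE    # hypothetical test pattern


class PagePool:
    """Hypothetical stand-in for the operating system's page allocator."""

    def __init__(self, page_count):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(page_count)]
        self.free = list(range(page_count))

    def allocate_page(self):
        if not self.free:
            raise MemoryError("no free pages")
        return self.free.pop(0)


def hook_allocation(pool, stamped, retired):
    """Replace pool.allocate_page with a version that verifies the test pattern.

    stamped: set of page numbers known to hold the test pattern (assumption).
    retired: set that collects page numbers removed from service.
    """
    original = pool.allocate_page  # the allocation function being hooked

    def checked_allocate():
        while True:
            page = original()
            if page not in stamped:
                return page              # page carries no pattern to verify
            stamped.discard(page)
            if pool.pages[page] == TEST_PATTERN:
                return page              # pattern intact: allocate as requested
            retired.add(page)            # pattern damaged: never hand the page out

    pool.allocate_page = checked_allocate
```

In this sketch a retired page is simply never returned to the free list, so later requests cannot receive it.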
In a further embodiment, the present invention teaches a method using a test pattern for testing a memory page while an operating system is active. The method comprises the steps of determining a deallocation scheme for deallocating the memory page by the operating system, hooking the deallocation scheme for deallocating the memory page; and upon receiving a request to deallocate the memory page, storing the test pattern in the memory page.
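By way of illustration only, the deallocation-side steps recited above may be sketched as follows. The `PagePool` class is a hypothetical stand-in for the operating system's deallocation scheme, the 0xA5 test pattern is arbitrary, and the `stamped` set used to record which pages hold the pattern is an assumption of this sketch.

```python
PAGE_SIZE = 4096                            # assumed page size
TEST_PATTERN = bytes([0xA5]) * PAGE_SIZE    # hypothetical test pattern


class PagePool:
    """Hypothetical stand-in for the operating system's page deallocator."""

    def __init__(self, page_count):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(page_count)]
        self.free = []

    def release_page(self, page):
        self.free.append(page)


def hook_deallocation(pool, stamped):
    """Replace pool.release_page with a version that stores the test pattern."""
    original = pool.release_page  # the deallocation routine being hooked

    def stamping_release(page):
        pool.pages[page][:] = TEST_PATTERN  # store the pattern in the freed page
        stamped.add(page)                   # remember that this page holds it
        original(page)                      # then release the page as usual

    pool.release_page = stamping_release
```

Because the pattern is written only when a page is released, the page can sit idle holding the pattern for any length of time, and any bit that changes in the interim can be detected when the page is next requested.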
In addition, one aspect of the present invention teaches a system for testing a memory page of a computer while an operating system is active. The system includes software code having one or more software instructions that take the place of a memory allocation/release scheme of the operating system, and having one or more software instructions for storing a test pattern in the memory page upon receiving a request to release the memory page.
In a further aspect, the software code includes one or more software instructions for determining whether the memory page has the test pattern upon receiving a request to allocate the memory page, and one or more software instructions for verifying the test pattern is correct to ensure the memory page is not defective. If the test pattern is correct, the code includes one or more software instructions for allocating the memory page as requested, and if the test pattern is incorrect, for removing the memory page from service.
A further embodiment of the present invention discloses a system for testing a memory page of a computer system while an operating system is active. The system includes a means for determining a deallocation scheme for deallocating the memory page by the operating system; a means for hooking the deallocation scheme for deallocating the memory page; and upon receiving a request to deallocate the memory page, a means for storing the test pattern in the memory page.
In an alternate aspect, the system includes a means for determining a function used by the operating system for allocating the memory page upon receiving a request to allocate the memory page; a means for hooking the function for allocating the memory page; a means for verifying that the test pattern is correct to ensure the memory page is not defective; a means for allocating the memory page as requested if the test pattern is correct; and a means for removing the memory page from service if the test pattern is incorrect.
Advantageously, unlike the related art, the present invention makes memory testing a continuous background task, automatically and accurately detects memory failure, and ultimately reduces user downtime.