The invention relates to an apparatus and method for memory modification tracking.
The invention finds particular, but not exclusive, application to fault tolerant computer systems such as lockstep fault tolerant computers which use multiple subsystems that run identically.
In such lockstep fault tolerant computer systems, the outputs of the subsystems are compared within the computer and, if the outputs differ, some exceptional repair action is taken.
U.S. Pat. No. 5,953,742 describes a fault tolerant computer system that includes a plurality of synchronous processing sets operating in lockstep. Each processing set comprises one or more processors and memory. The computer system includes a fault detector for detecting a fault event and for generating a fault signal. When a lockstep fault occurs, state is captured, diagnosis is carried out and the faulty processing set is identified and taken offline. When the processing set is replaced, a Processor Re-Integration Process (PRI) is performed, the main component of which is copying the memory from the working processing set to the replacement for the faulty one. A special memory unit is provided that is used to indicate the pages of memory in the processing sets that have been written to (i.e. dirtied) and is known as a ‘dirty memory’, or ‘dirty RAM’. (Although the term “dirty RAM” is used in this document, and such a memory is typically implemented using Random Access Memory (RAM), it should be noted that any other type of writable storage technology could be used.) Software accesses the dirty RAM to check which pages are dirty, and can write to it directly to change the status of a page to dirty or clean. Hardware automatically changes to ‘dirty’ the state of the record for any page of main memory that is written to. The PRI process consists of two parts: a stealthy part and a final part. During Stealthy PRI the working CPUset is still running the operating system; the whole of memory is copied once and, whilst this is going on, the dirty RAM is used to record which pages are written to (dirtied). Subsequent iterations copy only those pages that have been dirtied during the previous pass.
International patent application WO 99/66402 relates to a bridge for a fault tolerant computer system that includes multiple processing sets. The bridge monitors the operation of the processing sets and is responsive to a loss of lockstep between the processing sets to enter an error mode. It is operable, following a lockstep error, to attempt reintegration of the memory of the processing sets with the aim of restarting a lockstep operating mode. As part of the mechanism for attempting reintegration, the bridge includes a dirty RAM for identifying memory pages that are dirty and need to be copied in order to reestablish a common state for the memories of the processing sets.
In the previously proposed systems, the dirty RAM comprises a bit map having a dirty bit for each block, or page, of memory. However, with a trend to increasing size of main memory and a desire to track dirtied areas of memory to a finer granularity (e.g. 1 KB) to minimise the amount of memory that needs to be copied, the size of the dirty RAM needed to track memory modifications is increasing. For example, main memories in the processing sets of systems of the type described above have typically been of the order of 8 GB, but are tending to increase to 32 GB or more, for example to 128 GB and beyond. At the same time, as mentioned above, there is a desire to reduce the granularity of dirtied regions to less than the typical 8 KB page size (e.g., to 1 KB), in order to minimise the copy bandwidth required to integrate a new CPUset.
With the increasing size of main memory and/or the reduced page sizes, the number of bits, and consequently the size of the dirty RAM that is needed to track memory changes, can become large. As a result, the time needed to search the dirty RAM to identify pages that may have been modified and will need to be re-copied can increase to a point where it affects the time taken to re-integrate the main memory in the processing sets. Another problem that can occur is an increased risk of errors in the dirty RAM.
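To give a sense of scale, the following minimal sketch computes the size of a dirty bit map using the memory sizes and block sizes quoted above. It assumes exactly one dirty bit per block and binary units (1 KB = 1024 bytes); the function name `dirty_bitmap_bytes` is hypothetical and used only for illustration.

```python
KIB = 1024
GIB = 1024 ** 3


def dirty_bitmap_bytes(main_memory_bytes, block_bytes):
    """Return the size in bytes of a bit map holding one dirty bit per block.

    Assumption: one bit per block, rounded up to a whole number of bytes.
    """
    blocks = main_memory_bytes // block_bytes
    return (blocks + 7) // 8


if __name__ == "__main__":
    # (main memory in GB, block size in KB), taken from the figures in the text.
    for mem_gib, block_kib in [(8, 8), (32, 8), (128, 8), (128, 1)]:
        size = dirty_bitmap_bytes(mem_gib * GIB, block_kib * KIB)
        print(f"{mem_gib} GB at {block_kib} KB blocks: {size // KIB} KB of dirty bits")
```

Under these assumptions, moving from 8 GB of main memory tracked in 8 KB pages to 128 GB tracked in 1 KB blocks multiplies the number of dirty bits to be searched by a factor of 128.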
Accordingly, an aim of the present invention is to provide a more efficient approach to memory modification tracking.
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims.
In one aspect, the invention provides dirty memory control logic for a computer system. The dirty memory includes dirty indicators settable to indicate dirtied blocks of memory. The control logic is operable to interrogate the dirty memory to determine dirty indicators that are set and is operable to output an indication of the dirty indicators that are set. By providing hardware logic that is operable automatically to interrogate the dirty RAM to identify set dirty indicators, rapid and reliable interrogation of the dirty memory is possible.
Preferably a dirty indicator is implemented by one or more bits. The block of memory can be a page of memory. The control logic can advantageously be operable to interrogate the dirty memory word-by-word to identify words that include a set bit. A comparator can be provided for comparing bits of a word to a predetermined value to determine where a dirty indicator is set. The comparison could be performed serially for bits within a word, but it is advantageously done in parallel for the bits of the word. For example, by using associative memory, the interrogation of the dirty memory could be effected associatively in parallel to identify words that include a set bit.
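The word-by-word interrogation can be modelled in software as follows. This is a sketch only: the invention describes hardware logic, whereas the code below is a sequential software analogue. It assumes a 32-bit word, a single-bit dirty indicator, and a predetermined "clean" value of zero; `WORD_BITS` and `find_dirty_bits` are hypothetical names.

```python
WORD_BITS = 32  # assumed word width; the text does not fix one


def find_dirty_bits(dirty_words):
    """Yield (word_offset, bit_offset) for every set dirty bit.

    Each word is first compared as a whole against the clean value (zero),
    which models the parallel comparison of all bits of a word. Only words
    that fail this comparison are examined bit by bit.
    """
    for word_offset, word in enumerate(dirty_words):
        if word == 0:
            continue  # no dirty indicator set anywhere in this word
        for bit_offset in range(WORD_BITS):
            if (word >> bit_offset) & 1:
                yield word_offset, bit_offset


if __name__ == "__main__":
    words = [0, 0b101, 0, 1 << 31]
    for location in find_dirty_bits(words):
        print(location)
```

Skipping clean words with a single comparison is what makes a sparse dirty memory cheap to scan; the per-bit loop runs only for words already known to contain a set indicator.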
Address logic can be provided for determining a memory page address corresponding to a set dirty indicator. This can be effected by computing the page address given a known dirty bit location and a known mapping between the dirty memory and main memory.
The control logic can include a base register identifying a base address for the dirty memory and a word offset register for identifying a word within the dirty memory. In this manner the dirty bit location within the dirty RAM can be identified.
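The address computation can be illustrated as follows. The sketch assumes a simple linear mapping in which successive dirty bits correspond to successive pages of main memory, a 32-bit dirty memory word and an 8 KB page; these values, and the names `dirty_word_address` and `page_address`, are assumptions made for illustration rather than details taken from the text.

```python
WORD_BITS = 32              # assumed width of a dirty memory word
WORD_BYTES = WORD_BITS // 8
PAGE_SIZE = 8 * 1024        # assumed page size (8 KB)


def dirty_word_address(base_register, word_offset_register):
    """Locate a word in the dirty memory from the base and word offset registers."""
    return base_register + word_offset_register * WORD_BYTES


def page_address(word_offset, bit_offset, main_memory_base=0):
    """Map a dirty bit location to the address of the page it tracks.

    Assumption: bit b of word w tracks page number w * WORD_BITS + b.
    """
    page_index = word_offset * WORD_BITS + bit_offset
    return main_memory_base + page_index * PAGE_SIZE


if __name__ == "__main__":
    print(hex(dirty_word_address(0x1000, 3)))
    print(hex(page_address(1, 2)))
```

A different mapping between dirty memory and main memory would change only the body of `page_address`; the search logic that produces the word and bit offsets is unaffected.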
In a particular implementation, the dirty memory includes a lower level that includes groups of dirty indicators, each dirty indicator being settable to a given state indicative that a page of memory associated therewith has been dirtied, and at least one higher level that includes dirty group indicators settable to a predetermined state indicative that a group of the lower level associated therewith has at least one dirty indicator in a state indicative that a page of memory associated therewith has been dirtied. The control logic is divided into higher level control logic operable to interrogate the higher level memory and lower level control logic operable to interrogate the lower level memory. These two levels of logic can operate in parallel, enhancing the efficiency of the logic. More than two levels could be provided.
Where reference is made to a predetermined state, this will typically be the same for each of the levels (e.g., a 1 or a 0) to simplify the logic, but alternatively different states may apply in different levels.
In one example, a group of indicators has a length of one word. In a particular example, the highest level dirty memory has a length of one word.
The higher level control logic can be operable to identify a dirty group indicator that is set and to supply the lower level control logic with an indication of a group of dirty indicators for which a dirty group indicator is set. The indication can comprise a word offset value identifying a word in the higher level dirty memory and a bit offset value indicating a bit within that word at which a set dirty group indicator was identified.
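The two-level arrangement can be modelled in software as in the sketch below. It assumes that each group of the lower level is one 32-bit word, that a set bit (1) is the predetermined state at both levels, and that the search runs sequentially, whereas the text notes that the two levels of hardware logic can operate in parallel. The class `TwoLevelDirtyMemory` and its method names are hypothetical.

```python
WORD_BITS = 32  # assumed: a group of dirty indicators is one 32-bit word


class TwoLevelDirtyMemory:
    def __init__(self, num_pages):
        num_groups = -(-num_pages // WORD_BITS)  # ceiling division
        self.lower = [0] * num_groups            # one dirty bit per page
        self.higher = [0] * (-(-num_groups // WORD_BITS))  # one bit per group

    def mark_dirty(self, page):
        """Set the dirty indicator for a page and its dirty group indicator."""
        group, bit = divmod(page, WORD_BITS)
        self.lower[group] |= 1 << bit
        hi_word, hi_bit = divmod(group, WORD_BITS)
        self.higher[hi_word] |= 1 << hi_bit

    def dirty_groups(self):
        """Higher level search: yield (word_offset, bit_offset) of set group indicators."""
        for word_offset, word in enumerate(self.higher):
            if word == 0:
                continue
            for bit_offset in range(WORD_BITS):
                if (word >> bit_offset) & 1:
                    yield word_offset, bit_offset

    def dirty_pages(self):
        """Lower level search, visiting only the groups flagged by the higher level."""
        for word_offset, bit_offset in self.dirty_groups():
            group = word_offset * WORD_BITS + bit_offset
            word = self.lower[group]
            for bit in range(WORD_BITS):
                if (word >> bit) & 1:
                    yield group * WORD_BITS + bit


if __name__ == "__main__":
    memory = TwoLevelDirtyMemory(4096)
    for page in (5, 70, 4000):
        memory.mark_dirty(page)
    print(list(memory.dirty_groups()))
    print(list(memory.dirty_pages()))
```

The (word offset, bit offset) pair yielded by `dirty_groups` corresponds to the indication passed from the higher level control logic to the lower level control logic; the lower level then reads only the one group word it identifies.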
Another aspect of the invention provides a computer system comprising a dirty memory as defined above with at least one processing set that includes main memory. The computer system may be a fault tolerant computer system and include a plurality of processing sets that each includes main memory and a dirty memory. The processing sets can be configured normally to operate in lockstep, wherein the computer system includes logic operable to attempt to reinstate an equivalent memory state in the main memory of each of the processing sets following a lockstep error.
In another aspect, the invention provides a method of managing reinstatement of an equivalent memory state in the main memory of a plurality of processing sets of a fault tolerant computer following a lockstep error. The method includes the performance of at least one cycle of copying any page of memory that has been dirtied from a first processing set to each other processing set, each cycle including: interrogating a dirty memory comprising dirty indicators settable to indicate dirtied pages of memory by means of control logic operable automatically to interrogate the dirty memory to determine dirty indicators that are set.
In this method, direct memory access to main memory can be permitted during at least one cycle of copying any page of memory that has been dirtied from a first processing set to each other processing set. It is advantageous to permit accesses to continue, as this means that the system can remain responsive during the reinstatement, although this will cause further memory pages to be dirtied. However, as successive cycles of reinstatement have fewer pages to copy and therefore complete more quickly, the number of pages dirtied during each cycle should also tend to reduce.
As there may still be some pages dirtied after a number of passes, a time can be reached where the system is quiesced to prevent any further dirtying, permitting a final cycle of copying any page of memory that has been dirtied from a first processing set to each other processing set to be performed.
At this time, direct memory access by I/O devices is inhibited and the operating system is effectively suspended, again to prevent memory being updated further. However, a DMA operation to copy state from one processing set to the other remains operable to copy the remaining dirty pages.
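The overall reinstatement procedure, with several passes performed while the system remains active followed by a final pass with the system quiesced, can be sketched as a small simulation. This is an illustrative model rather than the method as implemented: main memory is represented as a Python list of page values, the dirty memory as a set of page numbers, and the writes that occur during each pass are supplied by the caller as a hypothetical `writes_per_pass` argument and are applied after that pass's copying. The function name `reintegrate` and the `max_stealthy_passes` limit are likewise assumptions.

```python
def reintegrate(source, target, writes_per_pass, max_stealthy_passes=4):
    """Copy source pages to target and return the number of pages copied per pass.

    writes_per_pass is a list of {page: value} dicts modelling the writes that
    land in the source memory while each stealthy pass is in progress.
    """
    dirty = set(range(len(source)))  # the first pass copies the whole of memory
    copied_per_pass = []

    for pass_no in range(max_stealthy_passes):
        to_copy, dirty = dirty, set()  # read and clear the dirty indicators
        for page in sorted(to_copy):
            target[page] = source[page]
        copied_per_pass.append(len(to_copy))

        # The system is still running, so further pages are dirtied.
        writes = writes_per_pass[pass_no] if pass_no < len(writes_per_pass) else {}
        for page, value in writes.items():
            source[page] = value
            dirty.add(page)

        if not dirty:
            break

    # Final pass: the system is quiesced, so no further dirtying can occur.
    for page in sorted(dirty):
        target[page] = source[page]
    copied_per_pass.append(len(dirty))
    return copied_per_pass


if __name__ == "__main__":
    source = list(range(8))
    target = [None] * 8
    counts = reintegrate(source, target, [{1: 100, 2: 200}, {2: 201}])
    print(counts, target == source)
```

In this model the number of pages copied per pass falls as long as fewer writes arrive during each shorter pass, and the final quiesced pass guarantees that the two memories are identical when the function returns.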