The invention relates to providing multiple address translations, for example for generating a memory image.
A particular application of the invention is in the context of fault tolerant computing systems. The invention is, however, not limited in its application to such systems.
The generation of a memory image is needed, for example, where it is necessary to establish an equivalent memory image in a fault tolerant computer system such as a lockstep fault tolerant computer that uses multiple subsystems that run identically. In such lockstep fault tolerant computer systems, the outputs of the subsystems are compared within the computer and, if the outputs differ, some exceptional repair action is taken. That action can include, for example, reinstating the memory image of one subsystem to correspond to that of another subsystem.
U.S. Pat. No. 5,953,742 describes a fault tolerant computer system that includes a plurality of synchronous processing sets operating in lockstep. Each processing set comprises one or more processors and memory. The computer system includes a fault detector for detecting a fault event and for generating a fault signal. When a lockstep fault occurs, state is captured, diagnosis is carried out and the faulty processing set is identified and taken offline. When the processing set is replaced, a Processor Re-Integration Process (PRI) is performed, the main component of which is copying the memory from the working processing set to the replacement for the faulty one.
International patent application WO 99/66402 relates to a bridge for a fault tolerant computer system that includes multiple processing sets. The bridge monitors the operation of the processing sets and is responsive to a loss of lockstep between the processing sets to enter an error mode. It is operable, following a lockstep error, to attempt reintegration of the memory of the processing sets with the aim of restarting a lockstep operating mode. As part of the mechanism for attempting reintegration, the bridge includes a dirty RAM for identifying memory pages that are dirty and need to be copied in order to reestablish a common state for the memories of the processing sets.
In these prior systems, the reintegration is controlled by software. However, as the systems grow in size, the memory reintegration process becomes more time consuming.
Accordingly, an aim of the present invention is to enable the generation of a memory image in a more efficient manner.
Particular aspects of the invention are set out in the accompanying independent and dependent claims.
One aspect of the invention provides a computer system comprising memory and at least a first processor that includes a memory management unit. The memory management unit includes a translation table having a plurality of translation table entries for translating processor addresses to memory addresses with at least one translation table entry providing at least a first memory address translation and a second, different, memory address translation for a processor address.
The provision of a memory management unit providing multiple translations for a given processor generated address (hereinafter processor address) means that the memory management unit can identify multiple storage locations that are relevant to a given processor address. The processor can, for example, be provided with instructions that define the copying of a memory location to another location without needing to specify that other location separately. Preferably, all of the translation table entries can provide first and second translations for a processor address.
The memory management unit can be operable selectively to enable either the first translation or the second translation to be used in response to the processor address, to provide selectable addressing of different parts of memory. Also, the memory management unit can be operable selectively to enable both the first translation and the second translation to be used in response to the processor address, to provide simultaneous addressing of different parts of memory. The memory management unit can include status storage, for example one or more bits in a translation table entry, containing one or more indicators as to which of the first and second translations are to be used in response to a processor address for a given address and/or instruction generating principal (e.g., a user, a program, etc.).
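The selectable use of the two translations can be illustrated with a minimal software sketch. It is not the hardware of the invention: the names `Select`, `Entry` and `translate`, the two-bit encoding and the example addresses are all hypothetical, chosen only to show one translation table entry holding two translations together with status bits that enable either or both.

```python
from dataclasses import dataclass
from enum import Flag


class Select(Flag):
    """Hypothetical status bits held in a translation table entry."""
    FIRST = 1
    SECOND = 2
    BOTH = FIRST | SECOND


@dataclass
class Entry:
    first: int      # memory address given by the first translation
    second: int     # memory address given by the second translation
    select: Select  # which translation(s) are currently enabled


def translate(table: dict[int, Entry], processor_address: int) -> list[int]:
    """Return the memory address(es) enabled for this processor address."""
    entry = table[processor_address]
    targets = []
    if Select.FIRST in entry.select:
        targets.append(entry.first)
    if Select.SECOND in entry.select:
        targets.append(entry.second)
    return targets


# Illustrative table: three processor addresses, one per selection mode.
table = {
    0x10: Entry(first=0x1000, second=0x8000, select=Select.FIRST),
    0x11: Entry(first=0x1001, second=0x8001, select=Select.SECOND),
    0x12: Entry(first=0x1002, second=0x8002, select=Select.BOTH),
}
```

In this sketch, `translate(table, 0x12)` yields both memory addresses, corresponding to simultaneous addressing, while the other two entries yield a single address each, corresponding to selectable addressing.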
In one application of the invention to high reliability computers, the first translation addresses a first memory and the second translation addresses a second memory, separate from the first memory. These could be in the form of a main memory and a backup memory, the backup memory being provided as a reserve in case a fault develops in the main memory.
In one embodiment of the invention, the first memory is a memory local to the first processor and the second memory is a memory local to a second processor interconnected to the first processor via an interconnection. In a particular embodiment of the invention, the interconnection is a bridge that interconnects an IO bus of the first processor to an IO bus of the second processor.
In such an embodiment, during a reintegration process following a loss of lockstep, the memory management unit of a primary processor can be operable in response to a replication instruction at a processor address to read from a first memory location in the first memory identified by the first translation for the processor address and to write to a second memory location in the second memory identified by the second translation for the processor address.
Alternatively, the memory management unit can be operable in response to a replication instruction at a processor address to read from a first memory location in the first memory identified by the first translation for the processor address and to write both to the first memory location in the first memory identified by the first translation and to a second memory location in the second memory identified by the second translation for the processor address.
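The two forms of replication instruction described above can be sketched in software as follows. This is an illustrative model only: the function name `replicate`, the flag `also_write_first`, the representation of each memory as a Python list and the example addresses are assumptions made for the sketch and are not taken from the described hardware.

```python
def replicate(table, first_memory, second_memory, processor_address,
              also_write_first=False):
    """Model one hypothetical replication instruction at a processor address."""
    # A single table entry supplies both translations.
    first_addr, second_addr = table[processor_address]
    content = first_memory[first_addr]        # read via the first translation
    if also_write_first:
        first_memory[first_addr] = content    # alternative form: write locally too
    second_memory[second_addr] = content      # write via the second translation
    return content


# Each processor address maps to (address in first memory, address in second memory).
table = {0: (4, 0), 1: (5, 1), 2: (6, 2)}
first_memory = [0, 0, 0, 0, 11, 22, 33]   # stands in for the primary processor's memory
second_memory = [0, 0, 0]                 # stands in for the second processor's memory

for pa in table:
    replicate(table, first_memory, second_memory, pa)
```

Note that the caller names only the processor address; the destination in the second memory is never specified separately, because it comes from the second translation in the same entry.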
In another embodiment, the first memory is a main memory for the first processor and the second memory is a backup memory. The second memory can be a memory local to a second processor interconnected to the first processor via an interconnection. For example, the interconnection could be a bridge that interconnects an IO bus of the first processor to an IO bus of the second processor, or could be a separate bus.
The memory management unit can be operable in response to a read instruction at a processor address to read from a first memory location in the first memory identified by the first translation for the processor address and is operable in response to a write instruction at a processor address to write to a first memory location in the first memory identified by the first translation and to a second memory location in the second memory identified by the second translation for the processor address.
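The read and write behaviour of this main and backup arrangement can be sketched as below. The class name `MirroringMMU`, its methods and the list-based memories are hypothetical and serve only to illustrate reads being satisfied from the first memory while writes go to both memories.

```python
class MirroringMMU:
    """Hypothetical model: reads use the first translation, writes use both."""

    def __init__(self, table, main, backup):
        self.table = table    # processor address -> (main address, backup address)
        self.main = main      # first memory
        self.backup = backup  # second memory

    def read(self, processor_address):
        first_addr, _ = self.table[processor_address]
        return self.main[first_addr]

    def write(self, processor_address, value):
        first_addr, second_addr = self.table[processor_address]
        self.main[first_addr] = value      # write via the first translation
        self.backup[second_addr] = value   # write via the second translation


table = {0: (2, 0), 1: (3, 1)}
main = [0, 0, 0, 0]
backup = [0, 0]
mmu = MirroringMMU(table, main, backup)
mmu.write(0, 7)
mmu.write(1, 9)
```

After the two writes in this sketch, the backup list holds the same values as the corresponding main locations, without the writing code ever naming a backup address.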
In one embodiment, the computer system comprises a plurality of processors interconnected by an interconnection, each processor comprising a respective memory management unit that includes a translation table having translation table entries providing at least first and second memory translations for a locally generated processor address, each first memory translation relating to the memory local to the processor to which the memory management unit belongs, and each second memory translation relating to the memory local to another processor. In a fault tolerant computer, the plurality of processors can be arranged to be operable in lockstep.
An embodiment of the invention can also include an IO memory management unit, the IO memory management unit including a translation table with a plurality of IO translation table entries for translating IO addresses to memory addresses, wherein at least one IO translation table entry provides at least a first memory address translation and a second, different, memory address translation for an IO address.
Another aspect of the invention provides a method of generating an image of locations in a memory of a computer. The method includes: a first processor generating predetermined instructions identifying processor addresses; and a memory management unit responding to a said processor address by providing, from a translation table entry for the processor address, a first translation of the processor address for a read from a first memory location; reading of the content of the first memory location; the memory management unit further providing, from said translation table entry for said processor address, a second translation of the processor address for a write of the content of the first memory location to a second memory location; and writing of the content of the first memory location to the second memory location.
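The steps of this method can be sketched as a simple loop. The sketch assumes, purely for illustration, that translation table entries are held per page with a tiny hypothetical `PAGE_SIZE`, and that each entry gives a base address for each of the two translations; the names `split` and `generate_image` and the example values are likewise assumptions.

```python
PAGE_SIZE = 4  # hypothetical and deliberately tiny, to keep the example readable


def split(processor_address):
    """Split a processor address into (page number, offset within the page)."""
    return divmod(processor_address, PAGE_SIZE)


def generate_image(table, source, destination, processor_addresses):
    """Copy each addressed location from the source to the destination memory."""
    for pa in processor_addresses:
        page, offset = split(pa)
        first_base, second_base = table[page]         # one entry, two translations
        content = source[first_base + offset]         # read via the first translation
        destination[second_base + offset] = content   # write via the second translation


# Pages 0 and 1 of the processor address space, each with two translations.
table = {0: (8, 0), 1: (12, 4)}
source = list(range(100, 116))   # first memory; locations 8..15 hold 108..115
destination = [0] * 8            # second memory, initially empty
generate_image(table, source, destination, range(2 * PAGE_SIZE))
```

Once the loop has run over the chosen processor addresses, the destination holds an image of the corresponding source locations.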