The central subsystem of a computer generally comprises three types of units: processors, memory modules forming the main memory, and input-output controllers. Usually the processors communicate with the memory modules through a bus that allows addressing and transfer of data between the processors and the main memory. To execute a program instruction, its operands must be located in the main memory. The same is true for successive instructions of the program to be executed. In the case of a multiprocessing system, the memory must be partitioned to allow multiplexing between programs. For this purpose, virtual addressing is generally used in conjunction with a paging mechanism that divides the addressable space, or "virtual space," into zones of a fixed size called "pages." In a system of this kind, a program being executed addresses a virtual space that corresponds to a real portion of the main memory. Thus, a logical or virtual address is associated with a physical or real address of the main memory.
An instruction that requires addressing contains information that enables the processor executing it to form a virtual address. In general, this virtual address is segmented, i.e., it is composed of a segment number, a page number, and a shift within the referenced page. The segment number in turn can be subdivided into a segment table number and a shift within this table.
In order to access an item of information in memory associated with this segmented address, several memory accesses are necessary. It is first necessary to access an address space table allocated to the process (program being executed). From this table, using the segment table number, the real address of the corresponding segment table is obtained. Next, as a function of the shift in the segment table, a segment descriptor is accessed which makes it possible to calculate the real address of a page table. Finally, the page number defines the shift within this page table, and the entry found there yields a real page address, thereby making it possible to address the memory. The real address of a word or particular octet is obtained by concatenation of the real page address with the shift within this page, defined by the least significant bits of the virtual address.
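The chain of accesses described above can be sketched in Python. The table representations, the 12-bit page shift, and the `translate` helper are illustrative assumptions for the sketch, not the actual memory format:

```python
# Sketch of the multi-access translation chain: address-space table ->
# segment table -> page table -> real page address.  All layouts and
# field widths here are assumed for illustration.

PAGE_BITS = 12                       # assumed shift within a page

def translate(seg_table_num, seg_shift, page_num, offset,
              aspace_table, memory):
    """Return the real address for a segmented virtual address."""
    # 1st access: the address-space table of the process yields the
    # real address of the segment table.
    seg_table = aspace_table[seg_table_num]
    # 2nd access: the segment descriptor, found at the shift within the
    # segment table, yields the real address of a page table.
    page_table = memory[seg_table][seg_shift]
    # 3rd access: the page number selects the entry holding the real
    # page address.
    real_page = memory[page_table][page_num]
    # The real address is the real page address concatenated with the
    # shift within the page (the least significant bits).
    return (real_page << PAGE_BITS) | offset
```

Each dictionary access above stands in for one memory access over the bus, which is what makes the unassisted translation costly.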
A memory access is relatively time consuming, due primarily to the use of a bus that is common to both the processors and the memory modules. To improve system performance, the successive memory accesses typically required to obtain a real address are avoided as much as possible. Most processes have a locality property according to which, during a given phase of its execution, a process uses a number of pages that is very small relative to the total number of pages allocated to it.
The locality property can be used to facilitate the translation of a virtual address into a real address by the use of "extracts." Each extract includes a virtual address and an associated real address, and is used by the program during a single execution phase. A plurality of extracts are stored in high-speed memory or in the registers. To perform a translation of a virtual address into a real address, a high-speed associative memory is accessed to determine whether the virtual address to be translated is already present in the high-speed memory. If it is, the real address is obtained directly without accessing the main memory.
The locality property motivates the use of cache memories composed of small, high-speed memories in which the pages most recently referenced are stored. The probability that a new reference will relate to an item of information already present in cache memory is high, so the effective access time is reduced. In a manner analogous to the translation of the virtual address into a real address, a cache memory comprises a table containing the real addresses of the pages present in the cache memory. This table, called a directory, can be consulted in an associative fashion to determine whether the information associated with a given real address is contained in the cache memory. If it is, a word or an octet is obtained by addressing the cache memory by means of the least significant bits of the virtual address of the word or octet.
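The directory consultation might be modeled as follows; the list-based directory, page-sized cache lines, and 12-bit page shift are assumptions made for the sketch:

```python
# Illustrative model of an associative directory lookup in a cache.
# The organization (list of page tags, one data block per slot) is an
# assumption, not the document's actual cache structure.

PAGE_BITS = 12
PAGE_MASK = (1 << PAGE_BITS) - 1

def cache_lookup(real_addr, directory, cache_data):
    """Return the cached item for real_addr, or None on a miss."""
    page = real_addr >> PAGE_BITS           # real page address (the tag)
    offset = real_addr & PAGE_MASK          # least significant bits
    for slot, tag in enumerate(directory):  # associative consultation
        if tag == page:
            return cache_data[slot].get(offset)  # hit: read the cache
    return None                             # miss: access main memory
```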
Issues related to the translation of addresses will now be discussed, it being understood that the same considerations may apply to cache memories. In both cases, the issue is that of rapidly obtaining information associated with a page address. In the case of translation of an address, the page address is a virtual address and the associated information is the corresponding real address, while in the case of cache memory, the page address is a real address and the associated information is composed of all the data contained in the page.
As previously discussed, the high-speed translation memory is an associative memory. The memory comprises a given number of registers or, more generally, locations, each capable of storing one extract. Each extract can be accompanied by additional information such as right-of-access indicators or indicators reporting that a write access has been effected in the page associated with the extract. Moreover, each extract is associated with a presence indicator which, for a given logical value, indicates that the associated extract is valid. These presence indicators are, for example, set to zero at initialization, i.e., each time a process is activated in a particular processor. Thus, as the process uses new pages, the associated extracts are loaded into the associative memory and the respective presence indicators are simultaneously set to 1. When a memory access must be executed, the virtual address is compared with the virtual address of each extract stored in the associative memory. If there is a match between the virtual address being sought and the virtual address of an extract whose presence indicator is set to 1, the corresponding real address is obtained directly by simply reading the register that contains it.
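The lookup with presence indicators can be sketched as follows; the `AssociativeMemory` class and its slot layout are illustrative assumptions:

```python
# Illustrative model of the associative translation memory with
# presence indicators.  The class name, slot layout, and sequential
# scan (hardware would compare all slots in parallel) are assumptions.

class AssociativeMemory:
    def __init__(self, size):
        self.virt = [None] * size     # virtual page of each extract
        self.real = [None] * size     # associated real page
        self.present = [0] * size     # presence indicators, zeroed at init

    def load(self, slot, vpage, rpage):
        """Store a new extract; its presence indicator is set to 1
        simultaneously."""
        self.virt[slot], self.real[slot] = vpage, rpage
        self.present[slot] = 1

    def lookup(self, vpage):
        """Compare vpage with every valid extract; return the real page
        on a match, None otherwise."""
        for i, v in enumerate(self.virt):
            if self.present[i] and v == vpage:
                return self.real[i]   # hit: no main-memory access needed
        return None                   # miss: full translation required
```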
In order for this translation mechanism to be practical, the associative memory must be of limited size. Consequently, for processes with many pages, the associative memory cannot store all the extracts corresponding to these pages. When the associative memory is full, the only way to store a new extract is to erase an old one. It is therefore necessary to provide a method for erasing an old extract and storing a new extract in its place. To accomplish this, a replacement algorithm is used that decides which old extract is to be replaced by a new extract. Many algorithms have already been proposed, for example:
the FIFO ("first in, first out") algorithm, in which the oldest extract is replaced;
the RAND ("random choice") algorithm, in which the extract to be replaced is chosen at random;
the LFU algorithm ("least frequently used"), in which the least frequently used extract is replaced; and
the LRU ("least recently used") algorithm, in which the least recently used extract is replaced. The LRU algorithm theoretically gives good results, but in practice it is preferable to use a simplified version, called the "pseudo-LRU." To manage n extracts, a true LRU requires the presence and management of log₂(n) bits per extract to maintain an ordered history of the uses of the extracts. On the other hand, a pseudo-LRU requires only a single bit per extract, called a reference bit or an indicator bit.
According to the pseudo-LRU algorithm, the reference bit is set to a first logical value (1, for example) when its associated extract is used. When the associative memory is full, all the presence indicators are set to 1, and a new extract must be loaded. The extract to be replaced is the first extract encountered, in the chronological order of filling, whose reference bit is set to 0. When saturation is reached, i.e., when all but one of the reference bits are set to 1, all the reference bits are reset to 0, and the extract whose reference bit was at 0 is replaced by the new extract. Resetting all the reference bits obliterates the history of the use of the extracts.
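A minimal model of this reference-bit management is sketched below; the `PseudoLRU` class name and the sequential fill-order search for the victim are illustrative assumptions:

```python
# Minimal model of the pseudo-LRU reference bits: one bit per extract,
# victim = first extract (in fill order) whose bit is 0, with a global
# reset at saturation.  The class and method names are assumptions.

class PseudoLRU:
    def __init__(self, n):
        self.ref = [0] * n            # one reference bit per extract

    def touch(self, slot):
        self.ref[slot] = 1            # set when the extract is used

    def victim(self):
        """Return the slot whose extract is to be replaced."""
        v = self.ref.index(0)         # first bit at 0, in fill order
        if self.ref.count(0) == 1:    # saturation: all but one bit at 1
            self.ref = [0] * len(self.ref)  # reset obliterates history
        return v
```

Note the trade-off the text describes: the reset keeps the mechanism to one bit per extract, at the cost of periodically forgetting which extracts were recently used.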
In view of the phenomenon of locality and the large number of memory accesses which occur during the execution of a program, the performance of a computer system depends in large measure on the speed with which its associative memory functions. It is therefore advantageous to optimize the associative read circuits, as well as the circuits that manage the presence and reference indicators, so that their updating does not slow down operation of the system.