1. Field of the Invention
The present invention relates generally to computer memory subsystems and more particularly to such a memory subsystem organized into what is known in the art as a virtual memory. Still more particularly, the invention relates to an apparatus for converting virtual addresses into real memory addresses and for effecting certain unique control functions within the memory hierarchy.
In most modern computer systems, when a program is executing, it frequently attempts to access data or code which resides somewhere in the system (that is, in some level of the cache/main store/Direct Access Storage Device (DASD) storage hierarchy or even at another node in a distributed system network). For the most primitive system, consider what the program must understand in order to make this access.
Where is the data (or code)? The location will generally determine what kind of address must be used for the access (e.g. main storage address of 24 bits, or sector address on a disk track, or node address in a network). The location will also determine what kinds of instructions must be used to accomplish the access (e.g. Load/Store/Branch for main storage accesses, channel command words for disk accesses, communication protocols for network accesses).
Is this data shared with other program executions? If it is, the access cannot proceed unless certain locks are held. If the changes which this program is about to make to data are not to be seen by others at this time, the Store instruction must be to some private address.
Is this data to be recoverable? If it is, some "journalling" strategy must be implemented so that a consistent prior state of the data can be retrieved when necessary.
Suppose, in this very primitive system, the program was in fact required to make these distinctions at each access. Then the following would result:
If the program were to be generally applicable, the accesses would be very slow, even for the most frequent occurrences of "trivial, safe" requests, namely, for private, unrecoverable data in main storage.
If the program were to perform well it would be locked into one accessing mode, so that it would not run correctly against data with different characteristics.
The program would be complex, large and prone to error.
Modern systems have addressed these problems in varying degrees. For instance:
Relocate architectures generally allow private, unrecoverable, nonpersistent data and programs to be addressed uniformly, with an address size of 16 to 32 bits (usually adequate for temporary computational requirements). When these architectures are implemented with proper "look-aside" hardware, the vast majority of such accesses are accomplished at cache or main storage speeds. Only when this look-aside hardware fails (less than one in one hundred attempts) does the system pay the cost of accessing the relocation table structure. And only when the relocation tables fail (i.e. the data is not in main storage) does the system pay the significant "page fault" overhead. Thus the penalties are paid only when they are really necessary, which is surely the goal of a good architecture and implementation.
When the data is to persist beyond this execution of this program, most modern systems require that, instead of Load/Store/Branch instructions, access be made by explicit requests to software-implemented "access methods." These access methods generally support data which are organized into certain defined aggregates, called "records" and "files." The "instructions" for access are usually called "read/write" or "get/put."
This data is not shared or recoverable. It may in fact be in main storage (in some buffer area). But for every access, the program must pay the overhead of these explicit "read/write" calls. Thus access methods, when suitably defined, have resulted in programs which are less complex and more generally usable than in primitive systems, but the performance of these accesses is uniformly poorer than that of Load/Store, and the data accessed must have been structured into the appropriate aggregate type.
When the data is to be shared or recovered, most modern systems require that explicit requests be made to software-implemented "data-base subsystems." These accesses are generally much slower than those for access methods, not only because of the additional functions of lock and journal management, but also because the kinds of aggregates which these subsystems support (e.g. relations, hierarchies) are themselves more complex.
Again, the data may in fact be more simply structured and in a buffer in main storage, but the overhead must be paid on every access request.
Some systems support the recovery of non-persistent data with a facility called "checkpointing." Now the programmer who wishes to write a recoverable application must deal with three different facilities: checkpointing for computational data, explicit backup for files, and "commit" instructions for data bases.
The IBM System/38 has gone farther than most systems in providing at least a uniform addressing structure for all data. But it has done this at the cost of making all addresses very large and many accesses very slow, and of requiring much storage and hardware to implement the architecture; nor has it yet provided a uniform approach to sharing or recovery.
Various techniques are known in the art whereby a number of computer programs, whether executed by a single central processing unit or by a plurality of such processing units, share a single memory. The memory being shared by programs in this manner requires an extremely large apparent storage capacity, which capacity is often much larger than the actual capacity of the memory. If, for example, a system employs a 32-bit addressing scheme, 2^32 addressable bytes of virtual storage are available. This virtual storage space is conventionally thought of as being divided into a predetermined number of areas or segments, each of which is in turn divided into pages, with each page consisting of a predetermined number of lines, each in turn having a predetermined number of bytes. Thus segment and page designations or addresses assigned to virtual storage are arbitrary programming designations and are not actual locations in main storage. Therefore, virtual segments and pages are usually randomly located throughout main storage and are swapped in and out of main storage from backing stores as they are needed.
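The division of a 32-bit virtual address into segment, page and byte fields can be sketched as follows. The field widths used here (4-bit segment, 16-bit page, 12-bit byte offset) and the function name are illustrative assumptions only; the text above does not fix them.

```python
# Hypothetical field widths for a 32-bit virtual address (assumptions).
SEGMENT_BITS = 4
PAGE_BITS = 16
OFFSET_BITS = 12  # 4096 bytes per page under this assumption


def split_virtual_address(va):
    """Return (segment, page, byte_offset) for a 32-bit virtual address."""
    if not 0 <= va < (1 << 32):
        raise ValueError("virtual address must fit in 32 bits")
    offset = va & ((1 << OFFSET_BITS) - 1)                # low-order bits
    page = (va >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)   # middle bits
    segment = va >> (OFFSET_BITS + PAGE_BITS)             # high-order bits
    return segment, page, offset


segment, page, offset = split_virtual_address(0x12345678)
print(hex(segment), hex(page), hex(offset))
```

With these assumed widths, the three fields together account for all 32 bits, so every one of the 2^32 virtual byte addresses has exactly one decomposition.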
The random location of segments and pages in main storage necessitates the translation of virtual addresses to actual or real addresses using a set of address translation tables, conventionally referred to as page frame tables, that are located in main storage. In a large virtual system a great many such address translation tables are employed. These may be organized in a number of different ways. The essential feature of any such organization is that the particular virtual address must logically map to a memory location in said tables which will contain the real address for said virtual address (if one exists).
Functionally, the operation of such address conversion tables is as follows: the high order bits of the particular virtual address are used to access a specific section of said translation tables, which relate to a particular frame or segment, whereupon a search is then performed on the lower bits to see if a particular virtual address is contained therein and, if so, what real address is associated therewith. Each page table pointed to by a virtual frame address contains the real locations of all of the pages in one of the frames. Therefore if a particular frame is divided into, for example, 16 pages there would be 16 page tables for each frame, and a separate frame table which would have the entries pointing to a particular set of individual page tables. It should be understood that the above description is generalized in nature and that there are many different ways of organizing the address conversion utilizing the page tables, as well as the means for addressing same, starting with the CPU produced virtual address. In the subsequent description of the preferred form of the invention as set forth and disclosed in the embodiment there will be a detailed description of the hash address tables (HAT) and the inverted page tables (IPT) which, in essence, are functionally organized as set forth above.
When making the actual address translation, regardless of the details of the overall system organization and use of the page tables, the proper entry point into the page-frame tables is made and the page tables are accessed using the presented virtual address as the argument and, usually after a plurality of memory accesses, the desired entry in the page tables is found. At this point a check is usually made to determine if all system protocols have been followed and if so, the real address of the requested page in memory is accessed from the page table. The byte portion of the virtual address or "byte offset" is essentially a relative address and is the same in the virtual page as in the real page whereby once the desired real page address portion of the virtual address has been translated, the byte offset portion is concatenated onto the real page address location to provide the real byte address in main storage.
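The generalized translation just described, including the concatenation of the byte offset onto the real page address, might be sketched as below. This is a minimal illustration under stated assumptions: the two-level dictionary layout, the field widths, and the names `translate` and `PageFault` are hypothetical, and the sketch is not the HAT/IPT organization of the preferred embodiment.

```python
# Assumed field widths (hypothetical): 4-bit segment, 16-bit page, 12-bit offset.
PAGE_BITS = 16
OFFSET_BITS = 12


class PageFault(Exception):
    """Raised when the tables hold no real address for the virtual page."""


def translate(va, segment_table):
    """Map a virtual address to a real address via a two-level table lookup.

    segment_table maps a segment number to a page table; each page table
    maps a virtual page number to a real page number.
    """
    offset = va & ((1 << OFFSET_BITS) - 1)
    page = (va >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
    segment = va >> (OFFSET_BITS + PAGE_BITS)

    page_table = segment_table.get(segment)   # high-order bits pick the table
    if page_table is None or page not in page_table:
        raise PageFault(hex(va))              # page is not in main storage
    real_page = page_table[page]

    # The byte offset is the same in the virtual and the real page, so it is
    # simply concatenated onto the real page address.
    return (real_page << OFFSET_BITS) | offset


tables = {0x1: {0x2345: 0x00A}}
print(hex(translate(0x12345678, tables)))
```

Only the page-level portion of the address is looked up; the offset bits pass through unchanged, which is why one table entry serves every byte in the page.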
As is well known in current virtual memory systems, in order to avoid having to translate a virtual address each time the memory is accessed, current translations of recently used virtual addresses to real addresses are retained in a special set of rapidly accessible tables or high speed memories referred to as Directory Look-Aside Tables (DLAT) or Translation Look-Aside Buffers (TLBs) as used in the present invention. These tables or buffers are conventionally special high speed or rapidly accessible memories which may be accessed much faster than the previously described page frame tables whereby frequently used virtual addresses may be stored in this table and accessed very rapidly with the resultant saving of a great deal of execution time within the computer. The efficiency of such TLB address translation systems is predicated upon the fact that, subsequent to the first access to a particular virtual page, there will be a great many accesses to the same page during a given program execution. As indicated above, even though subsequent accesses are to different lines and bytes within a page, the virtual to real page address translation is the same for that page regardless of which line or byte is being addressed.
The use of the TLBs significantly reduces the number of translations that must be made (in the page frame tables) and thus has a considerable effect on the performance of the overall virtual memory system.
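The effect of retaining recent translations can be illustrated with a small software model. This is a sketch only: real TLBs are hardware structures, and the class name `TLB`, the least-recently-used replacement policy, the capacity, and the 12-bit offset are all assumptions made for the example.

```python
from collections import OrderedDict

OFFSET_BITS = 12  # assumed page size of 4096 bytes


class TLB:
    """Tiny look-aside cache in front of a slow table walk (illustrative)."""

    def __init__(self, walk, capacity=4):
        self.walk = walk              # slow path: virtual page -> real page
        self.capacity = capacity
        self.entries = OrderedDict()  # virtual page -> real page, LRU order
        self.hits = 0
        self.misses = 0

    def translate(self, va):
        vpage = va >> OFFSET_BITS
        offset = va & ((1 << OFFSET_BITS) - 1)
        if vpage in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpage)       # mark as recently used
        else:
            self.misses += 1
            self.entries[vpage] = self.walk(vpage)  # consult the page tables
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return (self.entries[vpage] << OFFSET_BITS) | offset


page_table = {0x12345: 0x00A}
tlb = TLB(page_table.__getitem__)
tlb.translate(0x12345678)   # first access to the page: table walk
tlb.translate(0x12345004)   # same page, different byte: served from the TLB
print(tlb.hits, tlb.misses)
```

Because the cached entry is keyed on the virtual page alone, the second access to a different byte of the same page avoids the table walk entirely.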
Another problem with such prior art relocation systems is journalling, that is, maintaining a copy of data in backup storage while a current program is running and using the data, so that in the event of some hardware or software failure a valid copy of the original data will still be available. This function has been accomplished in the past by complex and time-consuming hardware and software routines, which provide the requisite journalling function at the cost of slowing down memory performance.