1. Technical Field
The present invention generally relates to computer memory and in particular to computer memory addressing techniques. Still more particularly, the present invention relates to a system and method for improved memory address translations with protection checking.
2. Description of the Related Art
In today's computer systems, the system memory is managed by the operating system, and is allocated to different software applications as needed. Virtual memory is a technique by which a relatively smaller amount of physical memory can be made to seem larger and shareable among many processes. Each software application therefore deals with "effective" addresses in a virtual memory space, which allow the application to read, write, and execute when required, without ever being concerned with the actual locations, in physical or disk memory, where the operations are taking place. The application relies on the operating system to perform the mapping from the effective address to a physical address.
Address translation is the mechanism by which effective addresses generated by the CPU to access virtual memory are translated into real memory addresses. Address translation is a complex procedure that, if not implemented well, can end up on the critical path determining the clock cycle of the processor. This is true for all architectures, and more so for architectures requiring a two-level process to translate addresses.
Referring now to FIG. 4, a flowchart of a conventional two-level address translation process is shown. To translate an address using two-level address translation, an effective address must first be translated into an interim virtual address using segment information, and then into a physical address using page table information. Both translation phases require checking of protection bits, which dictate the types of accesses that are allowed (e.g., read, write, read and write, no-execute).
In this Figure, when the CPU requests an effective address (step 400), the system first checks the Segment Registers (SR) and Segment Lookaside Buffers (SLB) to determine a virtual address corresponding to the effective address (step 410). In doing so, the system must also perform protection checking to be sure that the type of access requested by the process is permitted (step 420). Note that the various caching structures mentioned here are described more fully below. If protection checking passes, this process produces a virtual address; otherwise, an error is returned.
Next, the virtual address is used to access the Translation Lookaside Buffer to determine the correct page table and physical address corresponding to the virtual address (and thereby to the effective address) (step 430). Once again, the system must also perform protection checking to be sure that the type of access requested by the process is permitted (step 440). Finally, if protection checking passes, a valid physical address is returned to the CPU process (step 450).
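The flow of FIG. 4 can be sketched roughly as follows. This is only an illustrative sketch, not the implementation of any particular architecture: the 4 KiB page size, the 256 MiB segment size, the dictionary-based tables, and the names `translate_two_level` and `ProtectionError` are all assumptions introduced here.

```python
PAGE_BITS = 12     # assumption: 4 KiB pages
SEGMENT_BITS = 28  # assumption: 256 MiB segments


class ProtectionError(Exception):
    """Raised when the requested access type is not permitted."""


def translate_two_level(effective_address, access, segments, pages):
    """Translate an effective address to a physical address.

    segments: hypothetical map of effective segment ID -> (virtual segment ID, allowed accesses)
    pages:    hypothetical map of virtual page number  -> (physical frame number, allowed accesses)
    An unmapped segment or page raises KeyError in this sketch.
    """
    # Step 410: effective address -> virtual address using segment information.
    esid = effective_address >> SEGMENT_BITS
    vsid, segment_allowed = segments[esid]
    # Step 420: first protection check.
    if access not in segment_allowed:
        raise ProtectionError(f"segment {esid:#x} does not allow {access}")
    segment_offset = effective_address & ((1 << SEGMENT_BITS) - 1)
    virtual_address = (vsid << SEGMENT_BITS) | segment_offset

    # Step 430: virtual address -> physical address using page table information.
    vpn = virtual_address >> PAGE_BITS
    pfn, page_allowed = pages[vpn]
    # Step 440: second protection check.
    if access not in page_allowed:
        raise ProtectionError(f"page {vpn:#x} does not allow {access}")
    # Step 450: return the physical address.
    return (pfn << PAGE_BITS) | (virtual_address & ((1 << PAGE_BITS) - 1))
```

Note that every translation in this sketch pays for two table lookups and two protection checks, even when the same address was translated a moment earlier.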
With reference now to FIG. 5, an example of the use of page tables for address translation is shown. In this figure, both virtual memory 515/525 and physical memory 500 are divided up into multiple "pages," each of which is typically the same size. Each of these pages is given a unique Page Frame Number (PFN indicating a Page Frame Number in physical memory, VPFN indicating a Page Frame Number in virtual memory). For every instruction in a program, e.g., to load a register with the contents of a location in memory, the CPU performs a mapping from a virtual address to a physical one. Also, if the instruction itself references memory, then a translation is performed for that reference.
The address translation between virtual and physical memory is done by the CPU using page tables which contain all the information that the CPU needs. Typically, there is a page table for every process in the system. FIG. 5 shows a simple mapping between virtual addresses and physical addresses using page tables for Process X 525 and Process Y 515.
In this example, Process X's virtual PFN 0, shown in the virtual memory 525 for Process X, is mapped into memory in physical PFN 1, using Process X's page tables 520. Process Y's virtual PFN 1, shown in the virtual memory 515 for Process Y, is mapped into physical PFN 4, using Process Y's page tables 510.
Each entry in the page tables 510/520 contains the following information:
The virtual PFN,
The physical PFN that it maps to, and
Protection (access control) information for that page.
To translate a virtual address into a physical one, the CPU must first work out the address's virtual PFN and the offset within that virtual page. The CPU then searches the process's page tables for an entry which matches the virtual PFN. That entry gives the physical PFN being sought.
The CPU then takes that physical PFN and multiplies it by the page size to get the address of the base of that page in physical memory. Finally, the CPU adds in the offset to the instruction or data that it needs.
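The arithmetic just described can be sketched as follows. The 4096-byte page size, the `PageTableEntry` type, and the function name are assumptions made for illustration; the two mappings for Process X are the ones described for FIG. 5, and the protection strings are placeholders.

```python
from typing import NamedTuple

PAGE_SIZE = 4096  # assumption: the text does not state a page size


class PageTableEntry(NamedTuple):
    vpfn: int        # virtual page frame number
    pfn: int         # physical page frame number it maps to
    protection: str  # access control information (placeholder values)


def translate_via_page_table(virtual_address, page_table):
    # Work out the virtual PFN and the offset within that virtual page.
    vpfn, offset = divmod(virtual_address, PAGE_SIZE)
    # Search the process's page table for an entry matching the virtual PFN.
    for entry in page_table:
        if entry.vpfn == vpfn:
            # Base of the physical page, plus the offset into it.
            return entry.pfn * PAGE_SIZE + offset
    raise LookupError(f"no page table entry for VPFN {vpfn}")


# Process X: VPFN 0 -> PFN 1 and VPFN 7 -> PFN 0.
process_x = [PageTableEntry(0, 1, "read-write"), PageTableEntry(7, 0, "read-only")]
print(hex(translate_via_page_table(0x0123, process_x)))  # 0x1123
```

A linear search keeps the sketch close to the wording above; a real page table would use an indexed or hashed structure.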
By mapping virtual to physical addresses this way, the virtual memory 515/525 can be mapped into the system's physical pages 500 in any order. For example, in FIG. 5, Process X's VPFN 0 is mapped to physical PFN 1 whereas VPFN 7 is mapped to physical PFN 0 even though it is higher in virtual memory than virtual PFN 0. Therefore, the pages of virtual memory do not have to be present in physical memory in any particular order.
Unlike typical user-level programs, Database and Transaction Processing applications that run in Server and PC Server environments have large memory requirements. Database workloads touch a large number of distinct pages in memory, which places high demands on the address translation mechanism to access them.
To enhance system performance, particularly in relation to memory, several different types of memory caches may be used. These include a page cache, which is used to speed up access to images and data in a virtual memory which is stored on a disk. As pages are read into memory from disk, they are cached in the page cache. If they are discarded and then needed again, they can quickly be fetched from this cache.
Pages may contain data buffers being used by the kernel, device drivers, and so on. The buffer cache is a lookaside list of buffers. If, for example, a device driver needs a 256-byte buffer, it is quicker to take a buffer from the buffer cache than to allocate a physical page and then break it up into 256-byte buffers.
When a system utilizes a disk-based virtual memory, this memory is generally stored in a "swap file." To save time in storing data in a swap file (which is much slower than RAM), many systems use a "swap cache," so that only written (or dirty) pages are saved in the swap file. So long as these pages are not modified after they have been written to the swap file, then the next time the page is swapped out there is no need to write it to the swap file, as the page is already in the swap file. Instead, the page can simply be discarded. In a heavily swapping system this saves many unnecessary and costly disk operations.
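The swap cache behavior can be illustrated with a small sketch. The `SwapManager` class, its method names, and the in-memory dictionary standing in for the swap file are all hypothetical; real kernels track this state in per-page flags.

```python
class SwapManager:
    """Toy model of a swap file with a swap cache (illustrative only)."""

    def __init__(self):
        self.swap_file = {}      # page id -> contents stored "on disk"
        self.swap_cache = set()  # pages whose swap-file copy is still current
        self.disk_writes = 0

    def mark_dirty(self, page_id):
        # A modified page no longer matches its copy in the swap file.
        self.swap_cache.discard(page_id)

    def swap_out(self, page_id, data):
        """Swap a page out; return True if a disk write was needed."""
        if page_id in self.swap_cache:
            # Unmodified since last written: simply discard the page.
            return False
        self.swap_file[page_id] = data
        self.swap_cache.add(page_id)
        self.disk_writes += 1
        return True


manager = SwapManager()
manager.swap_out(1, b"v1")  # first swap-out: written to the swap file
manager.swap_out(1, b"v1")  # unchanged page: discarded, no disk write
manager.mark_dirty(1)
manager.swap_out(1, b"v2")  # modified page: written again
print(manager.disk_writes)  # 2
```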
One commonly implemented hardware cache resides in the CPU: a cache of Page Table Entries. In this case, the CPU does not read the page table directly but instead caches translations for pages as it needs them. These caches are the Translation Lookaside Buffers (TLBs), which contain copies of the information kept in the operating system's page table. When a reference to a virtual address is made, the CPU attempts to find a matching TLB entry. If it finds one, it can directly translate the virtual address into a physical one and perform the correct operation on the data. If the CPU cannot find a matching TLB entry, then it must get the operating system to help. It does this by raising an exception, which in essence signals the operating system that a TLB miss has occurred.
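A software model of this hit/miss behavior might look like the sketch below. The `TLB` class, the `TLBMiss` exception, the least-recently-used replacement policy, the capacity, and the 4096-byte page size are assumptions chosen for illustration; the text does not specify any of them.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumption: the text does not state a page size


class TLBMiss(Exception):
    """Models the exception raised when no matching TLB entry exists."""


class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()  # vpfn -> pfn, least recently used first

    def lookup(self, vpfn):
        if vpfn not in self.entries:
            raise TLBMiss(vpfn)
        self.entries.move_to_end(vpfn)  # mark as most recently used
        return self.entries[vpfn]

    def fill(self, vpfn, pfn):
        if vpfn not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[vpfn] = pfn
        self.entries.move_to_end(vpfn)


def translate_with_tlb(virtual_address, tlb, page_table):
    vpfn, offset = divmod(virtual_address, PAGE_SIZE)
    try:
        pfn = tlb.lookup(vpfn)   # fast path: TLB hit
    except TLBMiss:
        pfn = page_table[vpfn]   # stands in for the operating system's miss handler
        tlb.fill(vpfn, pfn)
    return pfn * PAGE_SIZE + offset
```

Here `page_table` is a plain dictionary from virtual PFN to physical PFN, standing in for the operating system's page table.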
Other performance-enhancing hardware features include Segment Registers (SRs), as well as small, fast caches called Segment Lookaside Buffers (SLBs). SRs and SLBs hold recently used segment table entries and are searched to provide necessary information for the effective-to-virtual address translation process.
These mechanisms enhance performance of the address translation process. The drawback of using caches, hardware or otherwise, is that the system must use more time and space maintaining these caches, and if the caches become corrupted, then the system will crash. Furthermore, these types of caches do not eliminate the need for protection checking.
Protection checking refers to the process of verifying that the requested type of memory access is permitted. Each entry may be marked, for example, "no execute" or "read-only." The operating system must check memory access against the protection information for that address to be sure that the memory access (whether read, write, or execute) may be performed. Because each access must be checked, any amount of traditional caching can only provide a limited improvement in memory access time, as the protection checking becomes a bottleneck. There is therefore a need for an efficient means for virtual memory access and address translation with an improved system for protection checking.
It is therefore one object of the present invention to provide an improved computer memory.
It is another object of the present invention to provide an improved computer memory addressing system.
It is yet another object of the present invention to provide an improved system and method for memory address translations with protection checking.
The foregoing objects are achieved as is now described.
A memory translation system is provided which includes an "address cache" (addrcache). This address cache contains translation information of recently referenced addresses, and is accessed before the conventional two-level address translation process. If a "hit" is made in the address cache, which does not require protection checking, the conventional address translation process is bypassed. The address cache stores its memory addresses according to the protection status of each address, so that protection checking is not performed as a separate step.
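One possible reading of this arrangement is sketched below. It is an interpretation offered for illustration only, not the claimed design: the `AddressCache` class, the per-access-type dictionaries, the 4 KiB page size, and the `slow_translate` callback (standing in for the conventional two-level process, which is assumed to raise an exception on a protection failure) are all assumptions.

```python
PAGE_BITS = 12  # assumption: 4 KiB pages


class AddressCache:
    """Caches translations separately per access type (illustrative sketch)."""

    def __init__(self, slow_translate):
        # slow_translate(effective_address, access) models the conventional
        # two-level translation and raises if protection checking fails.
        self._slow_translate = slow_translate
        self._by_access = {"read": {}, "write": {}, "execute": {}}
        self.hits = 0
        self.misses = 0

    def translate(self, effective_address, access):
        page = effective_address >> PAGE_BITS
        offset = effective_address & ((1 << PAGE_BITS) - 1)
        frame = self._by_access[access].get(page)
        if frame is None:
            # Miss: fall back to the conventional process. An entry is stored
            # only if that process succeeded for this access type.
            self.misses += 1
            physical = self._slow_translate(effective_address, access)
            frame = physical >> PAGE_BITS
            self._by_access[access][page] = frame
        else:
            # Hit: the entry's presence under this access type already implies
            # the access is permitted, so no separate check is made.
            self.hits += 1
        return (frame << PAGE_BITS) | offset
```

Because an entry is filed under an access type only after the conventional process has approved that type of access, a hit in this sketch carries the protection result with it. A real design would also have to invalidate entries when protection information changes, which is omitted here.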
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.