The present invention relates to address mapping techniques which may be used with multiple address spaces. More specifically, the invention relates to an arrangement in which a processor is connected to a main memory and can access data from a real address in main memory by generating a virtual address from one of a plurality of address spaces.
In a conventional multiple address space system, a processor cannot operate in more than one address space at a time, but a processor is capable of operating in any one of the multiple address spaces. Furthermore, in a multiprocessor, with several processors operating simultaneously, different processors may be operating in different address spaces at one time.
Multiple address spaces are conventionally implemented by providing a separate mapping table for mapping the virtual addresses in each address space into corresponding real addresses. Due to their size, these mapping tables are stored in main memory or secondary memory, so that accessing them is slow. In order to avoid accessing the appropriate mapping table for each virtual address generated by a processor, a translation lookaside buffer containing the most frequently used mapping table entries is located in the processor or its cache memory. Typically, the processor converts a virtual address to a real address, using the translation lookaside buffer or the mapping table if necessary, before feeding the address to its cache, resulting in delay. Additional delay occurs when accessing the mapping table in main or secondary memory, which will occur very frequently whenever a processor changes from operating in one address space to another and less frequently at other times.
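The conventional arrangement described above may be illustrated by the following minimal sketch. It is offered only as an illustration under stated assumptions and is not taken from any particular system: the class name `AddressSpaceMapper`, the 4096-byte page size, the least-recently-used replacement policy, and the flushing of the buffer on every change of address space are all hypothetical choices made for the example.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size, for illustration only


class AddressSpaceMapper:
    """One mapping table per address space, fronted by a small TLB."""

    def __init__(self, mapping_tables, tlb_capacity=4):
        # mapping_tables: {space_id: {virtual_page: real_page}}, standing in
        # for the large tables held in main or secondary memory.
        self.mapping_tables = mapping_tables
        self.tlb = OrderedDict()  # virtual_page -> real_page, current space only
        self.tlb_capacity = tlb_capacity
        self.current_space = None
        self.table_accesses = 0  # counts the slow mapping table lookups

    def switch_space(self, space_id):
        # Assumption: the TLB holds entries for one space only, so it is
        # emptied whenever the processor changes address space.
        if space_id != self.current_space:
            self.tlb.clear()
            self.current_space = space_id

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        rpage = self.tlb.get(vpage)
        if rpage is None:
            # TLB miss: fall back to the mapping table (the slow path).
            self.table_accesses += 1
            rpage = self.mapping_tables[self.current_space][vpage]
            if len(self.tlb) >= self.tlb_capacity:
                self.tlb.popitem(last=False)  # evict least recently used
            self.tlb[vpage] = rpage
        else:
            self.tlb.move_to_end(vpage)  # mark as recently used
        return rpage * PAGE_SIZE + offset
```

In this sketch, the first reference to each page after a call to `switch_space` always reaches the mapping table, which corresponds to the additional delay noted above when a processor changes from one address space to another.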
U.S. Pat. No. 4,481,573 discusses a virtual address translation unit which is shared by plural processors, one of which is connected to a common bus through a cache. Rather than providing a translation lookaside buffer in each processor, a translation lookaside buffer is provided in the address translation unit shared by all the processors. Therefore, the virtual address is fed directly to the cache, and is only translated when a cache miss occurs, avoiding delays due to translation. This patent does not, however, discuss the use of multiple address spaces.
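The general idea of feeding the virtual address directly to the cache and translating only on a miss may be sketched as follows. This is a simplified illustration and not the implementation of the cited patent: the names `SharedTranslationUnit` and `VirtualCache`, the single page table, the dictionary standing in for main memory, and the 4096-byte page size are assumptions made for the example.

```python
class SharedTranslationUnit:
    """Hypothetical translation unit shared by all processors."""

    def __init__(self, page_table, page_size=4096):
        self.page_table = page_table  # {virtual_page: real_page}
        self.page_size = page_size
        self.translations = 0  # how often translation was actually needed

    def translate(self, vaddr):
        self.translations += 1
        vpage, offset = divmod(vaddr, self.page_size)
        return self.page_table[vpage] * self.page_size + offset


class VirtualCache:
    """Cache indexed by virtual address; translates only on a miss."""

    def __init__(self, translator, memory):
        self.translator = translator
        self.memory = memory  # {real_address: data}, stands in for main memory
        self.lines = {}       # {virtual_address: data}

    def read(self, vaddr):
        if vaddr in self.lines:
            # Hit: no translation is performed at all.
            return self.lines[vaddr]
        # Miss: only now is the virtual address translated.
        raddr = self.translator.translate(vaddr)
        data = self.memory[raddr]
        self.lines[vaddr] = data
        return data
```

Because hits are served without consulting the translation unit, repeated reads of the same virtual address in this sketch incur the translation cost only once.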
U.S. Pat. No. 4,264,953 illustrates another approach to virtual address translation, in which each cache responds to virtual addresses in parallel with a mapper which translates the virtual addresses from the cache's multiprogrammed processor. The mapper may translate a virtual address into different real, or physical, addresses when its processor is running different programs, and some of the real addresses access a shared portion of main memory. In order to avoid consistency problems, the caches never store data from the shared portion of memory. To increase efficiency, each cache has a portion dedicated to each operating program so that it is not necessary to reload the entire cache when a processor changes from executing one program to another. Each processor sends both a virtual address and an identifying program number to its mapper in order to produce a physical address appropriate to the program it is running. This mapping of addresses based on which program is running may be thought of as an example of multiple address spaces. In effect, the only address space available to a processor at any given time is the one corresponding to the program it is running. The shared portion of memory is included in every address space, but the mapper recognizes addresses in the shared portion of memory somehow and inhibits the cache from storing data from that portion. This technique thus permits sharing only within a predetermined portion of memory, and precludes the cache storage of shared data.
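The behavior described above, in which a program number selects the mapping and data from the shared portion of memory is never cached, may be illustrated by the following sketch. It is not the implementation of the cited patent. In particular, the patent's cache and mapper operate in parallel, whereas this example is sequential for simplicity, and the way the shared portion is recognized here (a fixed set of real pages named `SHARED_REAL_PAGES`) is purely an assumption, as are all class and variable names.

```python
PAGE_SIZE = 4096               # assumed page size
SHARED_REAL_PAGES = {100, 101}  # hypothetical shared portion of main memory


class Mapper:
    """Maps (program number, virtual address) to a real address."""

    def __init__(self, tables):
        self.tables = tables  # {program: {virtual_page: real_page}}

    def translate(self, program, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        rpage = self.tables[program][vpage]
        # Addresses in the shared portion are flagged as not cacheable.
        cacheable = rpage not in SHARED_REAL_PAGES
        return rpage * PAGE_SIZE + offset, cacheable


class PartitionedCache:
    """Cache with a separate portion dedicated to each program."""

    def __init__(self, mapper, memory):
        self.mapper = mapper
        self.memory = memory   # {real_address: data}
        self.partitions = {}   # {program: {virtual_address: data}}

    def read(self, program, vaddr):
        part = self.partitions.setdefault(program, {})
        if vaddr in part:
            return part[vaddr]
        raddr, cacheable = self.mapper.translate(program, vaddr)
        data = self.memory[raddr]
        if cacheable:
            part[vaddr] = data  # shared data is never stored in the cache
        return data
```

In this sketch, two programs may use the same virtual address for different private data without reloading the cache on a program change, while every read of the shared portion goes to main memory.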
Goodman, J. R., "Using Cache Memories to Reduce Processor-Memory Traffic", 10th Annual Symposium on Computer Architecture, Trondheim, Norway, (Jun. 1983), discusses early work on what is now known as a "snoopy cache", used with a processor which is connected to the main memory through a bus which supports multiple processors. A snoopy cache may be used to increase system performance where processor-memory bandwidth is severely limited. Goodman recognizes the difficulty of task switching, which requires cache reloading, and suggests using a separate processor for each task. This paper does not deal with the issue of address translation and therefore does not discuss multiple address spaces.
Katz, R. H., Eggers, S. J., Wood, D. A., Perkins, C. L. and Sheldon, R. G., "Implementing a Cache Consistency Protocol", Conference Proceedings: The 12th Annual International Symposium on Computer Architecture, IEEE Computer Society Press, Piscataway, N.J., 1985, pp. 276-283, discuss a cache consistency protocol for use in a shared memory multiprocessor system including snoopy caches, but similarly do not deal with the issues of address translation and multiple address spaces.
Thakkar, S. S., and Knowles, A. E., "A High-Performance Memory Management Scheme", Computer, May 1986, pp. 8-19 and 22, discuss a number of conventional techniques for mapping virtual to real addresses. For example, the virtual address space can be divided into segments, and a segmented virtual address space could be provided for each process, with each virtual address including a process number field. Thakkar et al. discuss a segmented, paged virtual address space for each process in relation to the MUSS operating system, with a shared segment of the virtual address space of each process being accessible through a common segment table. Thakkar et al. also discuss DEC's VAX 11/780 system in which the virtual address divides the address space into system and user regions, selectable by the most significant virtual address bits, with a separate page table in main memory for each region. In this system, the entire page table need not be allocated in memory if it is not used, because its length is stored. Also, Sun Microsystems' Sun workstation performs address translation for a process by accessing the segment and page tables for that process in a high-speed memory. Thakkar et al. also discuss the MU6-G in which the page table size was reduced to cover only those pages currently resident in main memory, with a hardware page address register (PAR) being provided for every page in main memory. Although sharing of segments between all processes was possible with this technique, segments could not be shared between selected processes. In short, the prior art techniques described by Thakkar et al. do not provide flexible access to shared data.
It would be advantageous to have a space and time efficient address translation technique for use with multiple address spaces which would permit flexible access to shared data.