The present invention generally relates to processor emulation software, and more specifically to virtual memory paging during emulation of a target multiprocessor system.
Emulating a first computer architecture on a second computer architecture is a well-known technique in the area of data processing. It is becoming more common as the cost of developing new generation computer architectures continues to escalate. A program, called an “Emulator”, on a data processing system with the second computer architecture executes code designed for the first computer architecture: in essence pretending that it has the first computer architecture. The computer system having the second computer architecture and that executes the Emulator program is termed the “Host” computer system. A virtual computer system having the first (“emulated”) computer architecture is termed the “Target” system. Often both Target user and operating system software is executed together by the Emulator on the Host system, with the Target operating system managing resources for the Target user programs.
The computer industry has seen increasing interest over the past few years in Legacy Emulation on commodity-based platforms, such as the Intel IA-64 architecture. Through the literature and other sources, we are aware of about ten companies with research and/or development projects in this area. A common experience among them is recognition that the cost of emulation currently limits the achievable performance to equivalence with mid-range mainframe systems. Only two of the claims we have seen project fairly high performance, and these assume changes to the target OS to reduce the amount of code actually emulated in performing services for the target applications.
In order to strictly emulate the hardware instruction set of the Target system, software must transform the memory addresses generated by the emulation of the Target machine into addresses on the Host machine. One very critical element of Emulator performance is the matter of address mapping between the Target and Host systems. In virtual memory architectures, this critical element is the translation of Target virtual addresses into Host virtual addresses.
FIG. 1 is a block diagram illustrating an illustrative multiprocessor Host system utilized to emulate a Target system with a narrower word size. In the preferred embodiment disclosed below, the Host system utilizes 64-bit words, whereas the Target system supports 36-bit words. A multiprocessor system is shown in order to provide the level of performance necessary to emulate large-scale enterprise level Target systems. The multiprocessor system 40 shows two (2) microprocessors 42, each containing its own local cache memory 44. Some examples of microprocessors include Pentium II and IA-64 microprocessors from Intel Corporation, PowerPC microprocessors from Motorola, Inc. and IBM, and SPARC processors from Sun Microsystems. The cache memory 44 is typically implemented as extremely high-speed static random access memory (SRAM). The cache memory 44 may be implemented on the same semiconductor die as the microprocessor 42, or may be implemented as part of a multi-chip-module (MCM) with the microprocessor 42. In any case, the cache memory 44 for each microprocessor 42 is dedicated to that microprocessor 42. Note here that a single level of cache memory 44 is illustrative. Other cache memory configurations are within the scope of this invention. Note also that two microprocessors are shown. This is for illustrative purposes, and it is understood that the invention disclosed below envisions emulating a multiprocessor Target system on either a single processor or a multiprocessor Host system.
The two shown microprocessors 42 are coupled by and communicate over an intraprocessor bus 46. One of the functions of this intraprocessor bus 46 is to allow the two microprocessors 42 to communicate sufficiently so as to maintain coherence between their respective cache memories 44. A single bus has been shown. However, multiple busses are also within the scope of this invention.
Also coupled to the intraprocessor bus 46 is a Host bridge 50. This provides communications between the microprocessors 42 and the remainder of the computer system 40. Coupled to the Host Bridge 50 is Host memory 54. This is typically Dynamic Random Access Memory (DRAM). However, other types of memory may be utilized, including SRAM. Host memories 54 typically contain several orders of magnitude more memory than the cache memories 44.
Also coupled to the Host Bridge 50 is a system bus 60. The system bus 60 is utilized to couple the system 40 to lower speed peripheral devices. These lower speed peripheral devices can include display monitors, keyboards, communications devices, and the like (not shown here). Also coupled to the system bus are disk drives and other forms of storage capable of permanently storing data for the computer system 40. Shown in this figure are a Host disk drive 62 and a Target disk drive 68. The Host disk drive 62 typically contains the software required to emulate the Target system on the Host system. The Target disk drive 68 contains the software being emulated. It should be noted that the Host disk drive 62 is shown distinct from the Target disk drive 68. Additionally, only a single Host disk drive 62 and Target disk drive 68 are shown. It is shown this way for illustrative purposes. However, the present invention also envisions combining the two on shared drives. It must also be noted that the Target disk drive 68 will often actually consist of a large number of different physical disk drives. This is especially true when Target systems capable of supporting enterprise level databases are emulated.
Memory is considered herein a relatively high speed machine readable medium and includes Volatile Memories, such as DRAM 54 and SRAM 44, and Non-Volatile Memories (not shown) such as ROM, FLASH, EPROM, EEPROM, and bubble memory. Secondary Storage 62, 68 includes machine-readable media such as hard disk drives, magnetic drum, and bubble memory. External Storage (not shown) includes machine-readable media such as floppy disks, removable hard drives, magnetic tape, CD-ROM, and even other computers, possibly connected via a communications line. The distinction drawn here between Secondary Storage 62, 68 and External Storage is primarily for convenience in describing the invention. As such, it should be appreciated that there is substantial functional overlap between these elements. Computer software such as Target emulation software and user programs can be stored in a Computer Software Storage Medium, such as Memory 44, 54, Secondary Storage 62, 68, and External Storage. Executable versions of computer software can be read from a Non-Volatile Storage Medium such as External Storage (not shown), Secondary Storage 62, 68, and Non-Volatile Memory (not shown), and loaded for execution directly into Volatile Memory 44, 54, executed directly out of Non-Volatile Memory, or stored on the Secondary Storage 62, 68 prior to loading into Volatile Memory 44, 54 for execution.
Virtual memory provides a processor with an apparent or virtual memory address space typically much larger than the real memory actually employed. It also provides a contiguous address space employing discontiguous real memory pages. In the GCOS 8 environment, this capability consists of a directly addressable virtual space of 2**43 bytes and the mechanisms for translating this virtual memory address into a real memory address.
The remainder of the Background section discusses virtual memory addressing in the GCOS 8 environment sold by the assignee of this invention. In order to provide for virtual memory management, assignment, and control, the 2**43-byte virtual memory space is divided into smaller units called “Working Spaces” and segments. The 2**43 bytes of virtual memory space are divided into 512 2**34-byte Working Spaces (WS). Each WS has a unique WS number (WSN). These Working Space numbers are used to generate a particular virtual memory address. They are obtained indirectly from one of the eight 9-bit WS registers, or directly from one of the descriptor registers. Each Working Space is further broken into 2**22 1024 (2**10)-word or 4096 (2**12)-byte virtual pages. Each virtual page, when present, will map to a physical or “real” page of the same size. Note that the GCOS 8 hardware, but currently not the GCOS 8 operating system, additionally supports an XV mode that provides for 18-bit Working Space numbers supporting 2**18 active Working Spaces.
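By way of illustration only, the size relationships just described can be checked with a short Python sketch. The constant and function names are hypothetical, and the flat 43-bit byte address with the Working Space number in the high-order bits is an assumption made for the example, not a format taken from the GCOS 8 architecture.

```python
# Sizes are taken from the description; all names are hypothetical.
WS_NUMBER_BITS = 9    # 512 Working Spaces
WS_BYTE_BITS = 34     # 2**34 bytes per Working Space
PAGE_BYTE_BITS = 12   # 4096-byte (1024-word) pages

PAGES_PER_WS = 1 << (WS_BYTE_BITS - PAGE_BYTE_BITS)


def split_byte_address(vaddr):
    """Split a flat 43-bit byte address into (WSN, page, byte within page).

    Assumption: the WSN occupies the high-order 9 bits of the flat address.
    """
    if not 0 <= vaddr < 1 << (WS_NUMBER_BITS + WS_BYTE_BITS):
        raise ValueError("address outside the 2**43-byte virtual space")
    wsn = vaddr >> WS_BYTE_BITS
    within_ws = vaddr & ((1 << WS_BYTE_BITS) - 1)
    page = within_ws >> PAGE_BYTE_BITS
    byte_in_page = within_ws & ((1 << PAGE_BYTE_BITS) - 1)
    return wsn, page, byte_in_page
```

Under these assumptions, `PAGES_PER_WS` evaluates to 2**22, matching the page count given above.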
A segment is a part of a Working Space and may be as small as one byte or as large as 2**32 bytes for an extended segment. Thus, unlike the fixed size of a Working Space (WS), a segment size is variable. Segments are addressed by a 72-bit data item called a “descriptor” or a “segment descriptor”. Segments can be viewed as “framing” a portion of a Working Space. Multiple segments may frame different portions of the same Working Space, and may even overlap. Typically segments are set up by the operating system, but may be shrunk in size or otherwise reduced in capabilities by unprivileged user programs.
There are various segment types, which are distinguished by the type of data referenced and/or the sizes of the areas referenced. For the purposes of this description, the significant point is that every memory reference is performed through a virtual address computation that uses content from these descriptors.
The following is an example of a virtual address computation using specifically a standard segment descriptor. This is for illustrative purposes only, and the present invention includes all virtual memory references independent of the use of descriptors.
When a virtual address is generated, a portion of the information comes from a segment descriptor contained in a register such as the instruction segment register (ISR). For operands, the descriptor may be contained in other segment descriptor registers. The area of virtual memory constituting a segment is “framed” by its segment descriptor by defining a base value relative to the base of the Working Space and a bound value relative to the base of the segment.
For all memory accesses, a virtual address must be generated. This includes operand or descriptor loads and stores, as well as instruction fetches. The mechanics of generating the virtual memory address depend on whether the involved segment descriptor is a standard segment descriptor or a super segment descriptor. Thus the procedures described below for generating an operand virtual address with a standard segment descriptor also apply to virtual address generation for accessing the instruction, argument, parameter, and linkage segments, since the registers holding these segment descriptors can only contain standard segment descriptors (with the exception of the instruction segment register (ISR), which may alternatively contain extended descriptors in EI mode).
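The framing just described can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `Segment` class and `frame` function are hypothetical names, both values are expressed in bytes, and the bound is treated as the largest valid byte offset within the segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical model of the framing information in a segment descriptor."""
    base: int   # start of the segment, in bytes, relative to the Working Space base
    bound: int  # largest valid byte offset, relative to the segment base


def frame(segment, offset):
    """Return the Working-Space-relative byte address for an offset in a segment."""
    if not 0 <= offset <= segment.bound:
        raise IndexError("offset lies outside the segment bound")
    return segment.base + offset
```

For example, two segments with bases 4096 and 4100 and suitable bounds frame overlapping portions of the same Working Space, since offset 4 in the first and offset 0 in the second both yield byte address 4100.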
FIG. 2 is a block diagram illustrating virtual address generation using a standard segment descriptor in standard mode in a GCOS 8 system. The effective address (EA) 110 is typically generated during instruction execution. Typically, during each instruction cycle two different effective addresses 110 are generated: the address of the instruction to fetch for execution, and an instruction operand address. The virtual address generation shown here must be done for both. The effective address (EA) 110 is typically generated differently for different types of instructions and instruction modification types. For example, the effective address (EA) 110 may be loaded from memory, generated directly from the instruction, or be calculated as the sum of one or more registers and a constant. The GCOS 8 architecture also supports an indirect addressing mode that provides that an operand address specifies the address of an operand address, or the address of an address of an operand address, etc.
The Effective Address (EA) 110 in NS mode consists of four parts: sixteen leading zeroes 112; an 18-bit effective word address 114; a 2-bit byte offset within word 116; and a 4-bit bit offset within byte 118. The Effective Address (EA) 110 is added to a segment base address 120. The segment base address 120 comprises: a segment word address 124; and a segment byte offset 126. The segment base address is provided from one of the system segment registers discussed further in FIGS. 11-14. The summation 130 of the effective address (EA) plus the segment base comprises: a 2-bit Working Space modifier 132; a 32-bit word offset 134; and a 2-bit byte offset 136. The 2-bit Working Space modifier 132 is ORed with the lower 2 bits 139 of a 9-bit Working Space number 138 to generate an effective Working Space number 142. A 47-bit virtual address 140 is then generated comprising: the effective 9-bit Working Space number 142; a 32-bit word address within Working Space 144; a 2-bit byte offset within word 146; and a 4-bit bit offset within byte 148, from: the Working Space number 138 ORed with the Working Space modifier in the EA+Base 132; the EA+Base 134; and the bit offset in the Effective Address 118. It should be noted here that since the vast majority of GCOS 8 instructions executed do not utilize the virtual memory bit offset 148, it can be efficiently carried separately from the remainder of the virtual address 140 for those rare cases where it is needed.
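The address generation steps above can be sketched in Python as follows. This is an illustrative sketch only: the function name and argument layout are hypothetical, the segment base is assumed to be a 36-bit byte address whose two high-order bits become the Working Space modifier, the sum is assumed to wrap at 36 bits, and bound checking is omitted.

```python
def generate_virtual_address(ea_word, ea_byte, ea_bit, seg_base, wsn):
    """Combine an effective address with a segment base (hypothetical sketch).

    Returns (effective WSN, word address, byte offset, bit offset),
    corresponding to the 9-, 32-, 2-, and 4-bit fields of the virtual address.
    """
    if not (0 <= ea_word < 1 << 18 and 0 <= ea_byte < 4 and 0 <= ea_bit < 16):
        raise ValueError("effective address field out of range")
    if not (0 <= seg_base < 1 << 36 and 0 <= wsn < 1 << 9):
        raise ValueError("segment base or Working Space number out of range")

    ea_bytes = (ea_word << 2) | ea_byte               # 20-bit byte address
    total = (seg_base + ea_bytes) & ((1 << 36) - 1)   # assumption: wrap at 36 bits

    ws_modifier = total >> 34                         # 2-bit Working Space modifier
    word = (total >> 2) & ((1 << 32) - 1)             # 32-bit word offset
    byte = total & 0b11                               # 2-bit byte offset

    # ORing the modifier into the WSN only affects its lower 2 bits.
    effective_wsn = wsn | ws_modifier
    return effective_wsn, word, byte, ea_bit
```

Note that the bit offset is passed through unchanged, which is consistent with carrying it separately from the rest of the virtual address.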
The remainder of the Background section is used to illustrate virtual to real address translation. The example shown is for GCOS 8 NS mode address generation utilizing section tables in the virtual memory map hierarchy. This is in accordance with the preferred embodiment of the present invention. However, it should be noted that this description is illustrative. Many other computer architectures utilize similar virtual memory map hierarchies, and are within the scope of this invention.
FIG. 3 is a diagram illustrating the format of a virtual address when addressing a Working Space described by section tables (PTDW 150 T field 158=01). The virtual address 190 contains: a 9-bit effective Working Space number 182; a 12-bit section number 192; a 10-bit page number 194; a 10-bit word offset within page 186; a 2-bit byte offset within word 187; and a 4-bit bit offset within byte 188. The virtual address 190 in this FIG. 3 corresponds to the virtual address 140 shown in FIG. 2.
FIG. 4 is a block diagram that illustrates virtual address mapping 220 using a section table in the GCOS 8 architecture using the virtual address format shown in FIG. 3. A page directory base register (PDBR) 202 contains a pointer to a Working Space Page Table Directory (WSPTD) 204. The WSPTD 204 contains an array of Page Table Directory Words (PTDW) 150 (see FIG. 5). The effective Working Space number 182 is utilized to index into the WSPTD 204 in order to select the appropriate PTDW 206. The selected Page Table Directory Word (PTDW) 206 in turn addresses a section table (SCT) 222. The section table (SCT) 222 contains Page Table Base Words (PBW) 164 (see FIG. 6). The section number 192 is utilized to index into the section table (SCT) 222 to address a Page Table Base Word (PBW) 224. The selected PBW 224 addresses a Page Table (PT) 212. Page Tables (PT) 212 contain Page Table Words (PTW) 170 (see FIG. 7). The page number 194 portion of the virtual address 190 is utilized to index into the Page Table (PT) 212 to select the appropriate Page Table Word 214. The selected Page Table Word (PTW) 214 addresses one page of real memory. The word offset 186 portion of the virtual address 190 is then utilized to index into the selected page of memory 216 to address the selected word 218. The byte 187 and bit 188 offsets of the virtual address 190 are then utilized to index into the selected word 218, when necessary.
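For illustration, the field widths listed above can be used to split a 47-bit virtual address into its components. The following Python sketch uses hypothetical names and assumes that the fields are packed most significant first, in the order in which they are listed.

```python
from typing import NamedTuple


class SectionVirtualAddress(NamedTuple):
    """Fields of a section-table virtual address (names are hypothetical)."""
    wsn: int      # 9-bit effective Working Space number
    section: int  # 12-bit section number
    page: int     # 10-bit page number
    word: int     # 10-bit word offset within page
    byte: int     # 2-bit byte offset within word
    bit: int      # 4-bit bit offset within byte


WIDTHS = (9, 12, 10, 10, 2, 4)  # 47 bits in total


def decode(va):
    """Split a 47-bit integer into its fields, most significant field first."""
    if not 0 <= va < 1 << sum(WIDTHS):
        raise ValueError("virtual address does not fit in 47 bits")
    fields = []
    shift = sum(WIDTHS)
    for width in WIDTHS:
        shift -= width
        fields.append((va >> shift) & ((1 << width) - 1))
    return SectionVirtualAddress(*fields)


def encode(addr):
    """Pack the fields back into a single 47-bit integer."""
    va = 0
    for value, width in zip(addr, WIDTHS):
        if not 0 <= value < 1 << width:
            raise ValueError("field value exceeds its width")
        va = (va << width) | value
    return va
```

Since the bit offset is rarely needed, an emulator might equally keep it out of the packed value; the sketch packs all six fields only to mirror the format as described.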
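The table walk described above can be sketched as a chain of three lookups. This Python sketch is illustrative only: the tables are modeled as nested dictionaries, each Page Table Word is reduced to a hypothetical real page number, and the flags, size checks, and base address fields of the control words are omitted.

```python
WORDS_PER_PAGE = 1024  # 2**10 words per page, per the description


class PageFault(Exception):
    """Raised when an entry needed by the walk is absent (hypothetical)."""


def translate(wsptd, wsn, section, page, word_offset):
    """Walk WSPTD -> section table -> page table; return a real word address.

    wsptd maps a Working Space number to a section table, a section table
    maps a section number to a page table, and a page table maps a page
    number to a real page number.
    """
    if not 0 <= word_offset < WORDS_PER_PAGE:
        raise ValueError("word offset must fit in 10 bits")
    try:
        section_table = wsptd[wsn]           # selected via the PTDW
        page_table = section_table[section]  # selected via the PBW
        real_page = page_table[page]         # selected via the PTW
    except KeyError as missing:
        raise PageFault(f"no mapping for index {missing}") from None
    return real_page * WORDS_PER_PAGE + word_offset
```

A missing entry at any level is reported as a single `PageFault` here; real hardware would distinguish the cases through the flags in each control word.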
It should be noted that there are various flags and fields in each of the control words throughout this hierarchy. The use of this invention will preclude not only the necessity, but the possibility of access to these flags and fields by the emulated operating system. These data elements fall into two categories: (1) those employed for the purposes of managing the real memory mapped to the virtual space, and (2) those employed to control access to the virtual memory for purposes of security and data integrity. An example of the first category is a flag employed to indicate that a page of memory has been accessed. An example of the second category is the CPU write flag in the PTW.
Since the management of the virtual memory devolves to the host operating system, direct access to flags and fields of the first category is not necessary. Note, though, that this implies that the hosted operating system can operate within the virtual memory management policies enforced by the host operating system.
The emulated hardware will typically need to continue to enforce the access rights for emulated memory accesses. Therefore, it is preferred that the host hardware supplies a superset of the access control required by the emulated hardware and that the host operating system supplies suitable mechanisms for the emulation to exert the policies of the emulated operating system. In most cases this capability will not extend beyond the ability to control read and write access to individual pages and the ability to “pin” pages of memory so that they are not eligible for swap. Some implementations may have additional requirements such as the mapping of contiguous real memory pages for the benefit of hardware drivers. These details are beyond the scope of this description. It is assumed that anyone employing this invention has separately addressed this issue.
FIG. 5 is a diagram of the format of a page table directory word (PTDW) 150 in the GCOS 8 architecture.
The PT/SCT base 152 is a modulo 1024 (2**10) base address of a page table (PT) or section table (SCT). The PT/SCT size 162 field contains different information depending on the type of page table involved. For a dense page table (T=00), bits 24-35 indicate the modulo 64 size of the page table (PT). For a section table (T=01), bits 30-35 indicate the modulo 64 size of the SCT. Fragmented page tables (T=10) are not supported by the GCOS 8 operating system. If bits 30-35 are zero, a size of 64 words is assumed, and bits 24 through 29 are ignored.
FIG. 6 is a diagram of the format of a page table base word (PBW) 164 in the GCOS 8 architecture. Page table base words (PBW) 164 are utilized to address page tables (PT) and are the entries in a section table (SCT). The format of a 36-bit page table base word (PBW) 164 is shown in table T-2:
The PT base field 152 contains the modulo 1024 (2**10) base address of a dense page table. The PT size field 162 contains the modulo 64 size of a dense page table. If it is zero, a page table size of 64 words is assumed.
FIG. 7 is a diagram of the format of a page table word (PTW) 170 in the GCOS 8 architecture. Page table words (PTW) 170 are the entries in a page table (PT). Each page table word (PTW) 170 describes one page of real memory. The format of a 36-bit page table word (PTW) 170 is shown in table T-3:
The real memory address field contains the real address of the Memory Page.
FIG. 8 is a diagram that illustrates the contents of segment descriptor registers in a GCOS 8 environment. Thirteen segment descriptor registers are supported in the GCOS 8 architecture, and they are: eight Segment Descriptor Registers (DR0 through DR7) for operand addressing; an Argument Stack Register (ASR); a Data Stack Descriptor Register (DSDR); an Instruction Segment Register (ISR); a Linkage Segment Register (LSR); and a Parameter Segment Register (PSR). In the GCOS 8 environment, segment descriptors are 72 bits in size and are used to describe a contiguous subset of a Working Space.
FIG. 8 is a diagram illustrating the segment register representation of a standard segment descriptor. This is representative of the other types of segments supported by the GCOS 8 architecture. The segment register representation 302 comprises two 36-bit words stored in two words of memory or in a single 72-bit register. The format of the segment register representation is shown in table T-4:
The 3-bit Working Space Register (WSR) 314 field designates one of eight 9-bit Working Space registers. The contents of the selected WSR 314 are retrieved and used as the Working Space number for the segment. The 20-bit bound field 324 contains the maximum valid byte address within the segment. The 36-bit base field 318 contains a virtual byte address that is relative to the start of the designated Working Space defined by the WSR 314. Bits 0:33 are a 34-bit word address, and bits 34:35 identify a 9-bit byte within the word.
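As a minimal sketch of how these fields might be interpreted, the following Python fragment resolves the Working Space number through the WSR field and splits the base field into its word and byte parts. The function name is hypothetical, and the sketch assumes that bit 0 is the most significant bit of the 36-bit base field, so that bits 34:35 are the two low-order bits.

```python
def resolve_segment(wsr, base, ws_registers):
    """Return (Working Space number, word address, byte within word).

    wsr          -- 3-bit field selecting one of the eight WS registers
    base         -- 36-bit base field; assumption: bit 0 is the most significant
    ws_registers -- sequence of eight 9-bit Working Space register values
    """
    if len(ws_registers) != 8:
        raise ValueError("exactly eight Working Space registers are expected")
    if not 0 <= wsr < 8:
        raise ValueError("WSR field must fit in 3 bits")
    if not 0 <= base < 1 << 36:
        raise ValueError("base field must fit in 36 bits")

    wsn = ws_registers[wsr] & 0x1FF  # keep the 9-bit Working Space number
    word_address = base >> 2         # bits 0:33, the 34-bit word address
    byte_in_word = base & 0b11       # bits 34:35, the byte within the word
    return wsn, word_address, byte_in_word
```

A caller would combine the returned word address and byte with an effective address, and check the result against the bound field, before forming a virtual address.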