1. Technical Field
The present invention relates in general to a method and system for addressing 64 bit memory addresses. More particularly, the present invention relates to a system and method for using a 32 bit address for branching into a 64 bit target address.
2. Description of the Related Art
Computer systems in general and International Business Machines (IBM) compatible personal computer systems in particular have attained widespread use for providing computer power to many segments of today's modern society. Systems with microprocessors are finding themselves in an array of smaller and more specialized objects that previously were largely untouched by computer technology. Computer systems typically include a system processor and associated volatile and non-volatile memory, a display area, input means, and often interfaces, such as a network interface or modem, to other computing devices.
These computing devices are information handling systems which are designed primarily to give independent computing power to a single user, or a group of users in the case of networked computing devices. Personal computing devices are often inexpensively priced for purchase by individuals or businesses. Nonvolatile storage devices such as hard disks, CD-ROM drives and magneto-optical drives are considered to be peripheral devices. Computing devices are often linked to one another using a network, such as a local area network (LAN), wide area network (WAN), or other type of network, such as the Internet.
One of the distinguishing characteristics of these systems is the use of a system board to electrically connect these components together. At the heart of the system board are one or more processors. System manufacturers continually strive for faster, more powerful processors in order to supply systems for demanding applications. Processors, in turn, have evolved from simple 8 bit microprocessors all the way to current 64 bit processors. The addressable memory of these processors has likewise grown exponentially. Thirty-two bit microprocessors, such as IBM's 32-bit PowerPC processor and Intel's Pentium processor, could access 2^32 bytes of virtual memory (4 gigabytes). Meanwhile, 64 bit processors, such as Intel's IA-64, can access 2^64 bytes of virtual memory. The Intel IA-64 processor divides its memory into 8 separate regions. The three highest order bits (bits 63, 62, and 61) of a memory address form a region index, from 0 through 7, that determines the memory region that will be used.
While increasing the power and addressable memory space is advantageous, a large number of programs have already been written to operate in 32 bit environments. In a 64 bit architecture, a 32 bit address only uses the low 32 bits of the possible 64 bits. Therefore, the high order bits, including the bits determining the memory region, will be equal to zero (0). As a result, 32 bit programs only address memory in the first 4 GB memory area in the first region (region 0).
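The effect of a zero-extended 32 bit address can be illustrated with a minimal sketch. The helper names `region_index` and `zero_extend` are hypothetical and are not part of the IA-64 architecture or of any library; the sketch assumes only that the region index is the value of bits 63, 62, and 61.

```python
REGION_SHIFT = 61            # region index occupies bits 63, 62, and 61
MASK_64 = (1 << 64) - 1
MASK_32 = (1 << 32) - 1

def region_index(addr64: int) -> int:
    """Return the region index (0 through 7) held in bits 63-61 (hypothetical helper)."""
    return (addr64 & MASK_64) >> REGION_SHIFT

def zero_extend(addr32: int) -> int:
    """Zero extend a 32 bit address; the upper 32 bits are all zero (hypothetical helper)."""
    return addr32 & MASK_32

# Even the highest 32 bit address stays in region 0.
assert region_index(zero_extend(0xFFFFFFFF)) == 0
# A 64 bit address whose bits 63-61 are 0b101 falls in region 5.
assert region_index(0xA000000000000000) == 5
```

Because every zero-extended address has bits 63-61 equal to zero, no 32 bit value can select a region other than region 0 under this scheme.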
One of the reasons memory regions were established in the IA-64 architecture was to promote shared memory. If certain data, such as shared library text or read-only data, is used by more than one program running in the IA-64 architecture, a single copy can be placed in a region and shared amongst the programs. The memory management unit (MMU) is used to fetch the data from physical memory.
The MMU is a hardware component that manages virtual memory systems, including the virtual memory in the IA-64 architecture. Typically, the MMU is part of the CPU, though in some designs it is a separate chip. The MMU includes a small amount of memory that holds a table matching virtual addresses to physical addresses. This table is called the page table. All requests for data are sent to the MMU, which determines whether the data is in RAM or needs to be fetched from a nonvolatile storage device, such as a hard disk drive. If the data is not in memory, the MMU issues a page fault interrupt.
While it may be advantageous to use different regions for different types of shared memory, a challenge exists with legacy 32 bit programs in accessing these shared memory areas. The region where the 32 bit program is running must have its own copy of these shared memory areas in order to read from them. Translation lookaside buffers (TLBs) are hardware storage registers that keep the most recently used translations for loaded page table entries. Maintaining additional copies of shared memory in order to provide access to 32 bit programs is costly in terms of increased context switching and increased thrashing of translation hardware facilities such as page table entries and TLBs.
Other approaches have been developed to allow 32 bit programs to run within a 64 bit environment. The first approach is to zero extend the 32 bit addresses to be 64 bits in length. With this solution, however, bits 63-61 (the bits that determine the memory region) will equal 0 and, consequently, the program will always address memory region 0.
A second approach is to take the two highest order bits of the 32 bit address (bits 31 and 30) and use their values for high order bits of the 64 bit address (bits 62 and 61). This approach does allow use of four memory regions (because 2 of the 3 bits that determine the region index are set from the address); however, it introduces new challenges. Because the two high order bits of the 32 bit address are used for bits 62 and 61, only 30 bits remain from the original 32 bit address. Therefore, only 1 GB is addressable in each of the accessible regions. In addition, because only 2 bits are used, only 4 of the 8 memory regions are accessible. Shared memory in inaccessible regions still needs to have a copy maintained in an accessible region in order to be read by the 32 bit program. Moreover, because the 4 GB memory space from the 32 bit environment is spread across four regions, programs stepping through memory need to be redirected to another region when one of the 1 GB memory boundaries is crossed.
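A minimal sketch of this second approach follows. The function name `two_bit_region_map` is hypothetical, and the sketch assumes that bits 31 and 30 are moved to bits 62 and 61 and that only the remaining 30 bits serve as the offset within the region, as described above.

```python
def two_bit_region_map(addr32: int) -> int:
    """Map a 32 bit address to a 64 bit address using bits 31-30 as the region (sketch)."""
    addr32 &= 0xFFFFFFFF
    region = addr32 >> 30             # bits 31 and 30 give a value from 0 through 3
    offset = addr32 & 0x3FFFFFFF      # the remaining 30 bits span 1 GB
    return (region << 61) | offset    # region lands in bits 62 and 61; bit 63 stays 0

# The last byte of the first 1 GB and the first byte of the second 1 GB are
# adjacent in the 32 bit view but land in different regions in the 64 bit view.
last_in_region_0 = two_bit_region_map(0x3FFFFFFF)
first_in_region_1 = two_bit_region_map(0x40000000)
assert last_in_region_0 >> 61 == 0
assert first_in_region_1 >> 61 == 1
```

The two assertions show the boundary problem: consecutive 32 bit addresses on either side of a 1 GB boundary map to 64 bit addresses in different regions.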
A third approach is to sign-extend the 32 bit address. This approach allows access to two memory regions (region 0 if the high order bit (bit 31) is zero, and region 7 if the high order bit is one). This approach, again, presents significant challenges. Each of these two memory areas is only 2 GB in size (2^31 bytes), rather than 4 GB, because the high order bit is sign extended to determine the region index. Also, only two of the possible eight regions are accessible. If shared memory exists in any of the other six regions, a copy of the shared memory is needed in one of the two accessible regions. Moreover, because the 4 GB memory space from the 32 bit environment is spread across two regions, programs stepping through memory need to be redirected to another region when the 2 GB memory boundary is crossed.
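The sign extension approach can be sketched in the same way. The function name `sign_extend` is hypothetical; the sketch simply copies bit 31 into bits 63 through 32 and then reads the region index from bits 63-61.

```python
def sign_extend(addr32: int) -> int:
    """Sign extend a 32 bit address to 64 bits by replicating bit 31 (sketch)."""
    addr32 &= 0xFFFFFFFF
    if addr32 & 0x80000000:                   # bit 31 set: fill bits 63-32 with ones
        return addr32 | 0xFFFFFFFF00000000
    return addr32                             # bit 31 clear: bits 63-32 remain zero

# Addresses below 2 GB fall in region 0; addresses at or above 2 GB fall in region 7.
assert sign_extend(0x7FFFFFFF) >> 61 == 0
assert sign_extend(0x80000000) >> 61 == 7
```

No 32 bit input can yield a region index other than 0 or 7, because bits 63-61 are always three copies of bit 31.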
What is needed, therefore, is a method for 32 bit applications to access any region in a 64 bit environment without losing the normal contiguous 4 GB address footprint of a 32 bit process.
It has been discovered that a region index can be stored in low order bits (bits 0, 1, and 2) of a 32 bit address with minimal performance impact. A region index is stored in the high order bits of a 64 bit address. In the IA-64 architecture, there are eight memory regions and thus three corresponding high order bits used to address the desired memory region.
In some 64 bit architectures, including the IA-64 architecture, code entry points are set at certain minimal intervals to improve performance and system management. For example, the IA-64 architecture specifies that code entry points are on 16 byte boundaries. Because of this specification, the 4 low-order bits of a calling address are ignored since those bits would determine an address within a 16 byte region. In the present invention, these unused bit locations are used when processing 32 bit programs to store a region index in order to address a different memory region. In this manner, shared libraries and other shared data residing in other regions can be accessed by 32 bit programs, thus improving the performance of 32 bit programs operating in a 64 bit operating environment. External program calls are processed by the run-time linker. External program calls such as library calls and system calls are updated to store the region index of the corresponding library call or system call in the low 3 bits. Control is then passed to the 32 bit program's entry point where the instructions are processed. When the instructions are processed, out of module calls use glue code to bind the 32 bit instructions to the 64 bit operating environment. During the execution of the glue code, the region index that is stored in the 3 low order bits is copied into the high order bits of the corresponding 64 bit address. After the bits have been copied, the program branches to the 64 bit address.
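The tagging and glue code steps described above can be illustrated with a minimal sketch. The names `tag_entry_point` and `glue_branch_target` are hypothetical and do not come from the run-time linker or the IA-64 architecture. The sketch also assumes that the offset within the target region is the 32 bit address with its 4 low-order bits cleared; the actual glue code may form the offset differently.

```python
REGION_MASK = 0x7     # the low 3 bits carry the region index
ALIGN_MASK = 0xF      # entry points sit on 16 byte boundaries, so the low 4 bits are free

def tag_entry_point(entry32: int, region: int) -> int:
    """Linker side (sketch): store the target region index in the low 3 bits."""
    if entry32 & ALIGN_MASK:
        raise ValueError("entry point must be 16 byte aligned")
    if not 0 <= region <= 7:
        raise ValueError("region index must be 0 through 7")
    return (entry32 & 0xFFFFFFFF) | region

def glue_branch_target(tagged32: int) -> int:
    """Glue code side (sketch): copy the region index into bits 63-61."""
    region = tagged32 & REGION_MASK
    offset = tagged32 & 0xFFFFFFFF & ~ALIGN_MASK   # assumed: drop the ignored low bits
    return (region << 61) | offset

# A library entry point at 0x00401230 that resides in region 5.
tagged = tag_entry_point(0x00401230, 5)
assert tagged == 0x00401235
assert glue_branch_target(tagged) == 0xA000000000401230
```

In this sketch all 32 bits above the alignment bits remain available as the offset, so the full 4 GB footprint is preserved while any of the eight regions can be selected.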
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.