This invention relates in general to the field of microprocessor bus transaction ordering, and more particularly to caching of memory region type information for specifying such ordering.
Many modern microprocessors support the notion of virtual memory. In a virtual memory system, instructions of a program executing on the microprocessor refer to data using virtual addresses in a virtual address space of the microprocessor. The virtual address space is typically much larger than the actual physical memory present in the system. The virtual addresses generated by the program instructions are translated into physical addresses that are provided on a processor bus coupled to the microprocessor in order to access system memory or other devices, such as I/O devices.
A common virtual memory scheme supported by microprocessors is a paged memory system. A paged memory system employs a paging mechanism for translating, or mapping, virtual addresses to physical addresses. The physical address space of the processor bus is divided into physical pages of fixed size. A common page size is 4 KB. The virtual addresses comprise a virtual page number portion and a page offset portion. The virtual page number specifies a virtual page in the virtual address space. The virtual page number is translated by the paging mechanism into a physical page address, i.e., the physical address of the page on the processor bus. The physical page address is commonly referred to as a page base address. The page offset specifies a physical offset in the physical page, i.e., a physical offset from the page base address.
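The address arithmetic described above can be illustrated with a minimal sketch, assuming the common 4 KB page size mentioned above and a simple flat address; the function names are illustrative only and do not come from the specification:

```python
PAGE_SIZE = 4096    # 4 KB pages, as in the common case described above
OFFSET_BITS = 12    # log2(4096): low 12 bits are the page offset

def split_virtual_address(va: int) -> tuple[int, int]:
    """Return (virtual page number, page offset) for a virtual address."""
    vpn = va >> OFFSET_BITS          # upper bits select the virtual page
    offset = va & (PAGE_SIZE - 1)    # lower bits locate data within the page
    return vpn, offset

def form_physical_address(page_base: int, offset: int) -> int:
    """Concatenate a translated page base address with the page offset."""
    return page_base | offset
```

For example, the virtual address 0x00401A2C splits into virtual page number 0x401 and offset 0xA2C; once the paging mechanism translates 0x401 to a page base address, the offset is concatenated back on to form the physical address.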
The advantages of memory paging are well known. One example of a benefit of memory paging systems is that they enable programs to execute with a larger virtual memory space than physically exists. Another benefit is that memory paging facilitates relocation of programs in different physical memory locations during different or multiple executions of the program. Another benefit of memory paging is that it allows multiple processes to execute on the processor simultaneously, each having its own allocated physical memory pages to access without having to be swapped in from disk, and without having to dedicate the full physical memory to one process. Another benefit is that memory paging facilitates memory protection from other processes on a page basis.
Page translation, i.e., translation of the virtual page number to the page base address, is accomplished by what is commonly referred to as a page table walk. Typically, the operating system maintains page tables that contain information for translating the virtual page number to a page base address. Typically, the page tables reside in system memory. Hence, it is a relatively costly operation to perform a page table walk, since multiple memory accesses must typically be performed to do the translation.
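A hypothetical sketch of a two-level page table walk makes the cost concrete: each level requires a separate access to tables residing in system memory. The structure names and bit splits here are illustrative assumptions, not details from the specification:

```python
OFFSET_BITS = 12

def page_table_walk(page_directory: dict, vpn: int):
    """Walk a two-level page table for a 20-bit virtual page number.

    The upper 10 bits of the VPN index the page directory; the lower
    10 bits index the selected second-level page table. Each level
    costs one memory access, which is why a walk is relatively slow.
    Returns the page base address, or None if no mapping exists.
    """
    dir_index = vpn >> 10            # upper 10 bits of the VPN
    table_index = vpn & 0x3FF        # lower 10 bits of the VPN
    page_table = page_directory.get(dir_index)   # memory access #1
    if page_table is None:
        return None                  # no mapping: a page fault would occur
    return page_table.get(table_index)           # memory access #2
```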
To improve performance by reducing the number of page table walks, many microprocessors provide a mechanism for caching page table information, which includes page base addresses translated from frequently used virtual page numbers. The caching mechanism is commonly referred to as a translation lookaside buffer (TLB). The virtual page number is provided to the TLB, and the TLB performs a lookup of the virtual page number. If the virtual page number hits in the TLB, then the TLB provides the corresponding translated page base address, thereby avoiding the need to perform a page table walk to translate the virtual page number to the page base address. The page base address is concatenated with the page offset to generate a physical address supplied on the processor bus as part of a bus request to transfer data to or from the microprocessor.
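The TLB behavior just described can be sketched as follows. This is a minimal software model, assuming a dictionary-based page table standing in for the in-memory tables and a simple eviction policy; real TLBs are set-associative hardware structures:

```python
class TLB:
    """Minimal model of a translation lookaside buffer: caches
    virtual-page-number -> page-base-address translations."""

    def __init__(self, capacity: int = 64):
        self.capacity = capacity
        self.entries: dict[int, int] = {}   # vpn -> page base address

    def lookup(self, vpn: int):
        """Return the cached page base address on a hit, else None."""
        return self.entries.get(vpn)

    def fill(self, vpn: int, page_base: int) -> None:
        """Cache a translation, evicting the oldest entry when full."""
        if vpn not in self.entries and len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))
        self.entries[vpn] = page_base

def translate(tlb: TLB, page_tables: dict, vpn: int) -> int:
    """Consult the TLB first; fall back to a costly page table walk."""
    page_base = tlb.lookup(vpn)
    if page_base is None:               # TLB miss
        page_base = page_tables[vpn]    # stands in for the page table walk
        tlb.fill(vpn, page_base)        # cache for subsequent accesses
    return page_base
```

On the second access to the same virtual page number, the lookup hits in the TLB and the page table walk is avoided entirely.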
In a typical microprocessor system, devices of different types are coupled to the microprocessor bus, or some bus lower in the bus hierarchy of the system. Examples of the devices are system memory (commonly DRAM), ROM, and memory-mapped I/O devices, such as video controller frame buffers, or storage device control and status registers. The devices are addressed by physical addresses provided on the processor bus that are translated from virtual addresses as described above.
The various types of memory or devices accessed by the microprocessor have different attributes that affect the manner in which accesses to the memory or devices may be performed. For example, consider the case of a memory-mapped I/O device. Assume a store to a memory-mapped control register in a disk controller is followed by a load from a memory-mapped status register in the disk controller. In this situation, the processor bus request associated with the load must not be issued until the bus request associated with the store has completed, or else proper program operation may not occur. In contrast, it is typically desirable to allow accesses to different locations in system memory DRAM to be performed out-of-order and to be write-back cacheable. As a third example, it is typically desirable for reads from video frame buffers to not be cached to improve cache performance, and for writes to be delayed to allow for combining of multiple writes to the frame buffers to enhance write throughput.
Typically, a microprocessor provides a means for the operating system to specify a memory type associated with specified ranges of the processor bus space. That is, the microprocessor provides a mechanism for mapping a physical address range of the processor bus to a memory type, or device type, of memory or devices occupying the address range. The memory type specifies cache attributes associated with the address range, such as whether the address range is cacheable or uncacheable, write-back or write-through, writeable or write-protected, and whether write-combining is allowed. The characteristics specified by the memory type may also control whether the specified address range supports out-of-order execution or speculative accesses.
The circuit in the microprocessor for mapping a physical address on the processor bus to a memory type is commonly referred to as a memory type unit (MTU). The MTU receives a physical address and provides the memory type associated with the memory range in which the physical address lies. The MTU must operate on physical addresses, independently of the virtual-to-physical address mapping used to generate them. Because the MTU must operate on a physical address, when paging is enabled, the TLB lookup to produce the physical address and the MTU lookup are serialized. That is, the total time required to obtain the memory type in order to determine whether a load or store may proceed to the processor bus is at best the sum of the TLB lookup time plus the MTU lookup time.
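The range-to-type mapping an MTU performs can be sketched as follows. The class name, memory type names, and linear search are illustrative assumptions for clarity; a hardware MTU would use range registers and parallel comparators:

```python
WRITE_BACK = "write-back"
UNCACHEABLE = "uncacheable"
WRITE_COMBINING = "write-combining"

class MemoryTypeUnit:
    """Maps physical address ranges of the processor bus to memory types."""

    def __init__(self):
        self.ranges = []   # list of (start, end, memory type), end exclusive

    def add_range(self, start: int, end: int, mem_type: str) -> None:
        """Associate [start, end) with a memory type (set by the OS)."""
        self.ranges.append((start, end, mem_type))

    def lookup(self, phys_addr: int) -> str:
        """Return the memory type of the range containing phys_addr."""
        for start, end, mem_type in self.ranges:
            if start <= phys_addr < end:
                return mem_type
        return UNCACHEABLE   # a conservative default for unmapped addresses
```

For example, system memory DRAM might be configured write-back cacheable while a video frame buffer is configured write-combining, matching the attributes discussed above.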
However, it may be known sooner than the sum of the TLB and MTU lookup times that the load or store needs to generate a processor bus access. For example, assume the processor data cache indicates that a load address misses in the data cache, thus requiring a read from system memory on the processor bus. The data cache may generate the miss well before the memory type is available, due to the serialized lookup times of the TLB and MTU. This is detrimental to performance, particularly since accesses to system memory or other devices accessed through the processor bus may be relatively lengthy, and hence, should be initiated as soon as possible. Therefore, what is needed is a way to reduce the time required to determine the memory type.
The present invention provides an apparatus and method for caching memory types in the TLB of the processor in order to reduce the time required to obtain the memory type. Accordingly, in attainment of the aforementioned object, it is a feature of the present invention to provide a translation lookaside buffer (TLB) for caching memory types. The TLB includes an input that receives a virtual address. The TLB also includes a tag array, coupled to the input, which caches virtual addresses. The TLB also includes a data array, coupled to the input, which caches physical addresses translated from corresponding ones of the virtual addresses cached in the tag array, and which caches a memory type associated with each of the physical addresses. The TLB also includes an output, coupled to the data array, which provides the memory type associated with one of the physical addresses from the data array selected by the virtual address received on the input.
In another aspect, it is a feature of the present invention to provide a data unit in a microprocessor having a processor bus. The data unit includes a memory type unit (MTU) that stores physical memory ranges and memory types associated with the physical memory ranges. The data unit also includes a translation lookaside buffer (TLB), coupled to the MTU, which caches page table entries, and which caches memory types from the MTU associated with physical addresses in the page table entries.
In another aspect, it is a feature of the present invention to provide a microprocessor. The microprocessor includes a bus interface unit (BIU), coupled to a bus external to the microprocessor, which issues requests on the bus. The microprocessor also includes a memory type unit (MTU), coupled to the BIU, which stores memory types associated with address ranges of the bus. The memory types specify caching characteristics of the requests on the bus in each of the address ranges. The microprocessor also includes a translation lookaside buffer (TLB), coupled to the BIU, which caches virtual memory addresses, and which caches corresponding addresses of the bus translated from the virtual memory addresses, and which caches one of the memory types stored in the MTU for each of the addresses of the bus.
In another aspect, it is a feature of the present invention to provide a method of providing a memory type for a physical address range in a microprocessor. The method includes detecting a miss of a virtual address in a translation lookaside buffer (TLB), translating the virtual address into a physical address in response to the miss, providing a memory type of the physical address in response to the physical address, and caching the memory type in the TLB in association with the virtual address.
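The method above can be sketched end to end under illustrative assumptions (a dictionary TLB, a dictionary standing in for the page table walk, and a callable standing in for the MTU lookup): on a TLB miss, the virtual address is translated, the MTU provides the memory type for the resulting physical address, and the page base address and memory type are cached together in the TLB entry. On a subsequent hit, the memory type comes straight from the TLB, so the serialized MTU lookup is skipped:

```python
def translate_with_memory_type(tlb: dict, page_tables: dict, mtu_lookup, va: int):
    """Return (physical address, memory type) for a virtual address,
    caching the memory type in the TLB alongside the translation."""
    vpn, offset = va >> 12, va & 0xFFF
    entry = tlb.get(vpn)
    if entry is None:                       # TLB miss
        page_base = page_tables[vpn]        # page table walk
        mem_type = mtu_lookup(page_base)    # serialized MTU lookup
        tlb[vpn] = (page_base, mem_type)    # cache both in the TLB entry
        entry = tlb[vpn]
    page_base, mem_type = entry             # on a hit, no MTU lookup needed
    return page_base | offset, mem_type
```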
In another aspect, it is a feature of the present invention to provide a method of providing a memory type for a physical address range in a microprocessor. The method includes caching in a translation lookaside buffer (TLB) a plurality of physical addresses translated from a plurality of virtual addresses, and caching in the TLB a plurality of memory types associated with the plurality of physical addresses. The method also includes applying a virtual address to the TLB, and providing one of the plurality of memory types cached in the TLB associated with one of the plurality of physical addresses based on the virtual address applied to the TLB.
An advantage of the present invention is that it eliminates the need, in the typical case of a TLB hit, for the memory type unit (MTU) to perform its lookup of the physical address to obtain a memory type (MT) for the address. Consequently, the time required to determine whether a condition exists that requires blocking access on the processor bus to the physical address is reduced in the typical case, and the processor may potentially access the processor bus sooner than in the prior method. Another advantage of the present invention is that it alleviates the need to add another pipeline stage to accommodate the MTU lookup of the memory type of the prior method. The addition of another pipeline stage is detrimental to processor performance in the event of a mispredicted branch, since another stage of branch penalty would be introduced. Finally, the present invention alleviates the need to increase the clock cycle time of the processor to accommodate the MTU lookup of the memory type of the prior method.
Other features and advantages of the present invention will become apparent upon study of the remaining portions of the specification and drawings.