The clock cycle of a microprocessor ultimately determines the speed of the microprocessor for its various applications. In microprocessor designs, any of several timing paths may set the clock cycle. Consequently, it is important to optimize all microprocessor timing paths in order to make the clock cycle as short as possible and thereby improve the microprocessor's performance.
One of the critical timing paths for a microprocessor is its memory access timing path. The memory access timing path determines the time that it takes the microprocessor to retrieve data and instructions from a memory. The microprocessor uses these data and instructions in further processing. In numerous applications, the microprocessor chip includes an on-chip cache memory. A cache memory temporarily stores data and instructions that the processor has most recently used, in the expectation that the processor will use this information again soon. By having the most recently used data and instructions in a cache memory, the microprocessor may rapidly access these data and instructions without having to retrieve them from main memory. For many applications, the cache memory access timing path constitutes a major portion of the total memory access timing path. Therefore, the cache memory timing path is often among the critical timing paths that limit microprocessor processing.
Computer systems use virtual memory to enable them to work on problems whose data is too large to fit into the available physical memory (RAM). The virtual and physical memory address spaces are divided into blocks called pages. Virtual pages, which are stored on disk, are mapped into physical pages stored in RAM, so that they may be accessed by the computer's CPU. Computer systems use a module called a memory management unit (MMU) to perform this mapping from virtual to physical addresses. This operation is called address translation.
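The address translation described above can be illustrated with a minimal sketch. It assumes a 4 KB page size and a single-level page table held in a dictionary; the page size, the names `translate` and `page_table`, and the table layout are hypothetical choices for illustration and are not taken from any particular MMU.

```python
# Illustrative sketch only: page size and table layout are assumptions.
OFFSET_BITS = 12                 # assumed 4 KB pages (2**12 bytes)
PAGE_SIZE = 1 << OFFSET_BITS


def translate(virtual_address, page_table):
    """Translate a virtual address to a physical address.

    page_table maps virtual page numbers to physical page numbers.
    Returns None when the virtual page is not resident in RAM.
    """
    virtual_page = virtual_address >> OFFSET_BITS   # upper bits select the page
    offset = virtual_address & (PAGE_SIZE - 1)      # lower bits pass through unchanged
    physical_page = page_table.get(virtual_page)
    if physical_page is None:
        return None                                 # page is not mapped into RAM
    return (physical_page << OFFSET_BITS) | offset
```

For example, with `page_table = {0x12345: 0xABC}`, the call `translate(0x12345678, page_table)` replaces the page number and keeps the offset `0x678` unchanged. Only the page number is translated, which is why the offset bits are available before the translation completes.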
Data and instruction caches store address tags, which must match the incoming address in order for the cache to successfully return the desired data or instruction. These address tags may consist of either a virtual address or a physical address, depending on the design of the computer system. The CPU operates using virtual addresses. If a cache stores physical address tags, the incoming address to the cache must be a physical address, and the cache access therefore suffers a time delay while the MMU translates the original virtual address.
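The tag match described above can be sketched as follows. This is a simplified model that assumes a direct-mapped cache with 32-byte lines and 128 entries; the geometry and the names `split_address` and `cache_lookup` are hypothetical and chosen only to show how a stored tag is compared against the incoming address.

```python
# Illustrative sketch only: cache geometry is an assumption.
LINE_BITS = 5     # assumed 32-byte cache lines
INDEX_BITS = 7    # assumed 128 cache entries


def split_address(address):
    """Split an address into (tag, index) for a direct-mapped cache."""
    index = (address >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (LINE_BITS + INDEX_BITS)
    return tag, index


def cache_lookup(cache, address):
    """Return the cached data on a tag match, or None on a miss.

    cache maps an index to a (stored_tag, data) pair.
    """
    tag, index = split_address(address)
    entry = cache.get(index)
    if entry is not None and entry[0] == tag:
        return entry[1]      # stored tag matches the incoming address: hit
    return None              # empty entry or tag mismatch: miss
```

In a physically tagged cache, the `address` passed to `cache_lookup` would have to be the translated physical address, so the comparison cannot complete until the translation has produced the tag bits.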
Storing virtual address tags rather than physical address tags in the cache avoids this address translation delay, so virtually addressed caches can have a smaller total time delay than physically addressed caches. Unfortunately, software and operating system considerations often force the use of physically addressed caches, with their increased delays. With the UNIX operating system in particular, it is usually preferable to use physically addressed caches.
In physically addressed instruction and data caches, address comparison is performed between the address tags stored within the cache and the physical address presented to the cache. If the time delay in generating the physical address is larger than the time delay in fetching the address tag from within the cache, the cache will suffer an additional wait time due to the address translation. A method and system that expedite the combined address translation and cache matching process will shorten the timing path in the cache memory system.
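The timing relationship above can be made concrete with a small model. It assumes that the address translation and the tag fetch start at the same time and that the comparison begins only when both have finished; this overlap, the function names, and the delay units are assumptions for illustration and do not describe a specific design.

```python
# Illustrative timing model only: parallel translation and tag fetch is an assumption.
def translation_wait(t_translate, t_tag_fetch):
    """Extra time the comparison waits for the physical address."""
    return max(0, t_translate - t_tag_fetch)


def match_delay(t_translate, t_tag_fetch, t_compare):
    """Total delay until the tag comparison result is available."""
    # The comparison needs both inputs, so the slower one sets its start time.
    return max(t_translate, t_tag_fetch) + t_compare
```

Under these assumptions, a translation delay of 3 time units and a tag fetch delay of 2 time units leave the cache waiting 1 unit, while a translation that finishes before the tag fetch adds no wait at all.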
Thus, there is a need for a method and system that reduce the time required for the combined address translation and cache matching process.