1. Field
The present disclosure relates generally to processing systems, and more specifically, to caching instructions for a multiple-state processor.
2. Background
Computers typically employ a processor supported by memory. Memory is a storage medium that holds the programs and data needed by the processor to perform its functions. Recently, with the advent of more powerful software programs, the demands on memory have been increasing rapidly. The result is that modern processors require a large amount of memory, and large memories are inherently slower than small ones. Large memories with speeds capable of supporting today's processors are simply too expensive for large-scale commercial applications.
Computer designers have addressed this problem by organizing memory into several hierarchical components. The largest component, in terms of capacity, is commonly a hard drive. The hard drive provides large quantities of inexpensive permanent storage. The basic input/output system (BIOS) and the operating system are just a few examples of programs that are typically stored on the hard drive. These programs may be loaded into Random Access Memory (RAM) when the computer is operational. Software applications that are launched by a user may also be loaded into RAM from the hard drive. RAM is a temporary storage area that allows the processor to access the information more readily.
The computer's RAM is still not fast enough to keep up with the processor. This means that processors may have to wait for program instructions and data to be written to and read from the RAM. Caches are used to increase the speed of memory access by making the information most often used by the processor readily available. This is accomplished by integrating a small amount of memory, known as a primary or Level 1 (L1) cache, into the processor. A secondary or Level 2 (L2) cache between the RAM and L1 cache may also be used in some computer applications.
The speed of the computer may be further improved by partially decoding the instructions before they are placed into the cache. This process is often referred to as “pre-decoding,” and entails generating some “pre-decode information” that can be stored along with the instruction in the cache. The pre-decode information indicates some basic aspects of the instruction, such as whether the instruction is an arithmetic or storage instruction, whether the instruction is a branch instruction, whether the instruction will make a memory reference, or any other information that may be used by the processor to reduce the complexity of the decode logic. Pre-decoding instructions improves processor performance by reducing the length of the machine's pipeline without reducing the frequency at which it operates.
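By way of illustration only, the pre-decoding described above can be sketched in software as follows. The 16-bit toy encoding, the opcode values, and the names PredecodeInfo, predecode, and fill_cache_line are assumptions made for this sketch; they do not correspond to any real instruction set or to any particular hardware implementation.

```python
from dataclasses import dataclass

# Hypothetical toy encoding: the top 4 bits of a 16-bit word select the
# instruction class. These opcode values are illustrative assumptions.
OP_ADD, OP_LOAD, OP_STORE, OP_BRANCH = 0x1, 0x2, 0x3, 0x4


@dataclass(frozen=True)
class PredecodeInfo:
    """Basic aspects of an instruction, computed once at cache fill time."""
    is_arithmetic: bool
    is_memory_ref: bool
    is_branch: bool


def predecode(word: int) -> PredecodeInfo:
    """Partially decode one instruction word of the toy encoding."""
    opcode = (word >> 12) & 0xF
    return PredecodeInfo(
        is_arithmetic=(opcode == OP_ADD),
        is_memory_ref=(opcode in (OP_LOAD, OP_STORE)),
        is_branch=(opcode == OP_BRANCH),
    )


def fill_cache_line(words):
    """Store each instruction together with its pre-decode information."""
    return [(w, predecode(w)) for w in words]


# An add, a load, and a branch in the toy encoding.
line = fill_cache_line([0x1012, 0x2340, 0x4008])
```

In this sketch the decode logic that later reads the line can consult the stored flags instead of re-examining the opcode bits, which is the sense in which pre-decode information may reduce decode complexity.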
Processors capable of operating in multiple states are becoming commonplace with today's emerging technology. A “multiple-state processor” means a processor that can support two or more different instruction sets. The ARM (Advanced RISC Machine) processor, as sold by ARM Limited, is just one example. The ARM processor is an efficient, low-power RISC processor that is commonly used today in mobile applications such as mobile telephones, personal digital assistants (PDAs), digital cameras, and game consoles, just to name a few. ARM processors have historically supported two instruction sets: the ARM instruction set, in which all instructions are 32 bits long, and the Thumb instruction set, which compresses the most commonly used instructions into a 16-bit format. A third instruction set that has recently been added to some ARM processors is the “Thumb-2 Execution Environment” (T2EE). T2EE is an instruction set (similar to Thumb) that is optimized as a dynamic (JIT) compilation target for bytecode languages, such as Java and .NET.
These multiple-state processors have significantly increased the capability of modern-day computing systems, but can pose unique challenges to the computer designer. By way of example, if a block of instructions the size of one line in the L1 instruction cache contains instructions from multiple instruction sets, pre-decode information calculated assuming that the entire cache line contains instructions in one state cannot be used for those instructions that are actually in the other state. The solution described in this disclosure is not limited to ARM processors with Thumb and/or T2EE capability, but may be applied to any system that pre-decodes instructions for multiple instruction sets with overlapping instruction encodings prior to placing them into the cache.
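The difficulty can be illustrated with a minimal sketch. It assumes two hypothetical states, "A" and "B", whose encodings overlap so that the same bit pattern is a branch in one state but not in the other. The opcode values and the names is_branch and predecode_line are invented for illustration, and the sketch shows only the problem, not the solution of this disclosure.

```python
# Hypothetical overlapping encodings: the same 4-bit opcode field is
# interpreted differently in each state. Values are illustrative assumptions.
BRANCH_OPCODE = {"A": 0x4, "B": 0x7}


def is_branch(word: int, state: str) -> bool:
    """Return True if the word encodes a branch in the given state."""
    return ((word >> 12) & 0xF) == BRANCH_OPCODE[state]


def predecode_line(words, assumed_state):
    """Pre-decode a whole cache line assuming every word is in one state."""
    return [is_branch(w, assumed_state) for w in words]


# Two identical bit patterns; the second is actually a state-B instruction.
words = [0x4008, 0x4008]
actual_states = ["A", "B"]

cached = predecode_line(words, "A")  # line filled while in state A
correct = [is_branch(w, s) for w, s in zip(words, actual_states)]

# Positions whose cached pre-decode information cannot be used.
unusable = [i for i, (c, t) in enumerate(zip(cached, correct)) if c != t]
```

Here the cached flag marks the second word as a branch, although in its actual state it is not, so the pre-decode information for that position cannot be relied upon.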