The present invention relates generally to an integrated microprocessor system having a cache memory. In particular, the present invention relates to an apparatus and method for invalidating all cache lines of the cache memory within a single clock cycle.
In an Integrated Microprocessor System, for example, as described in U.S. patent application Ser. No. 09/1031,318 entitled Cache Divided for Processor Core and Pixel Engine Uses, filed Feb. 25, 1998 by Gary Peled et al. (“cache sharing patent”), now abandoned, the level-two cache memory is shared between the CPU (Central Processing Unit) and the graphics engine. Referring to FIG. 1, an Integrated Microprocessor System 100 in accordance with the teaching of the cache sharing patent includes a CPU 102, a main memory controller 104 and a graphics engine 108, which are all integrated on a single die. The CPU 102, the graphics engine 108, the cache 110, and a DRAM (Dynamic Random Access Memory) main memory 106 are all coupled to a bus 114 in order to communicate information back and forth between the various components of the system 100. Also coupled to the bus 114 is an I/O controller 116 which, as shown in FIG. 1, supports two input/output devices 118 and 120. Conventionally, the cache 110, which is sometimes referred to as a level-two cache (L2 cache), and the cache 112, which is sometimes referred to as a level-one cache (L1 cache), may be used to store a small subset of the information resident in the DRAM 106 in order to speed up the operation of the system 100.
However, in accordance with the cache sharing patent, the shared cache array 110 in the Integrated Microprocessor System 100 is the cache farthest from the processor core and is shared between CPU requests and graphics requests. The shared cache 110 can be in CPU only mode or in dual CPU/Graphics mode. During CPU only mode, the entire cache is available for CPU access. During dual CPU/Graphics mode, a predetermined portion of the shared cache 110 is available for the CPU and the remainder of the shared cache 110 is utilized for graphics requests. The process of switching from CPU only mode to dual CPU/Graphics mode, or vice versa, is referred to as Context Switching. These concepts apply both to this background and to the discussion of embodiments of the present invention.
Unfortunately, Context Switching can be a very time-consuming process, depending on the size of the shared cache 110. Conventionally, a context switch requires write back of all the modified cache lines to the main memory 106 and invalidation of all the cache lines in the shared cache 110. The same is true of a graphics request such as FLUSH. A graphics request such as CLEAR requires only invalidation of all the lines in the shared cache 110. Similarly, the CPU instruction WBINVD (Write Back Invalidate) writes back all the modified cache lines and then invalidates the entire cache 110, whereas the instruction INVD (Invalidate) just invalidates the entire cache 110.
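The difference between the two classes of operation described above can be illustrated with a minimal software sketch. This is a behavioral model only, not the hardware described in this document: the `CacheLine` fields, the function names `invd` and `wbinvd`, and the dictionary standing in for the main memory are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    # Hypothetical per-line state: a valid bit, a modified (dirty) bit,
    # an address tag, and a placeholder for the cached data.
    valid: bool = False
    modified: bool = False
    tag: int = 0
    data: int = 0


def invd(cache):
    """Invalidate every line; any modified data is discarded (INVD/CLEAR-like)."""
    for line in cache:
        line.valid = False
        line.modified = False


def wbinvd(cache, main_memory):
    """Write back modified lines, then invalidate every line (WBINVD/FLUSH-like)."""
    for line in cache:
        if line.valid and line.modified:
            # Only valid, modified lines need to reach main memory.
            main_memory[line.tag] = line.data
    invd(cache)


# Usage: one modified line and one clean line.
cache = [CacheLine(True, True, 0x10, 111), CacheLine(True, False, 0x20, 222)]
memory = {}
wbinvd(cache, memory)
print(memory)                         # only the modified line was written back
print(any(l.valid for l in cache))    # no line remains valid
```

In this model both functions visit the lines one at a time, which mirrors the sequential behavior of the conventional approaches.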
Normally, writing back all the modified cache lines to the main memory and invalidating the entire cache 110 is done either with microcode routines or with dedicated hardware such as Finite State Machines (FSMs). Both of these methods use micro operations that take multiple CPU clock cycles per cache line. The disadvantage of these methods is that, because invalidating each line takes a few clock cycles, invalidating the entire cache can take thousands of clock cycles or more, depending on the cache size.
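The scaling of this cost with cache size can be made concrete with a rough estimate. The sketch below assumes a purely sequential scheme with a fixed cost per line; the function name and the example figures (a 256 KB cache, 32-byte lines, 3 cycles per line) are illustrative assumptions and are not taken from this document.

```python
def sequential_invalidate_cycles(cache_bytes, line_bytes, cycles_per_line):
    """Estimate the cycles needed to invalidate a cache one line at a time."""
    if line_bytes <= 0 or cache_bytes % line_bytes != 0:
        raise ValueError("cache size must be a whole multiple of the line size")
    num_lines = cache_bytes // line_bytes
    return num_lines * cycles_per_line


# Assumed example: 256 KB cache, 32-byte lines, 3 cycles per line.
cycles = sequential_invalidate_cycles(256 * 1024, 32, 3)
print(cycles)  # 8192 lines * 3 cycles = 24576
```

Under these assumptions the line count, and therefore the cycle count, doubles each time the cache size doubles, and any write-back traffic to the main memory would add to this figure.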
Therefore, there remains a need to overcome the limitations in the above-described existing art, which is satisfied by the inventive structure and method described hereinafter.