The present embodiments relate to microprocessors and microprocessor-based systems, and are more particularly directed to microprocessor circuits, systems, and methods with a combined on-chip pixel and non-pixel cache structure.
Microprocessor technology continues to advance at a rapid pace, with consideration given to all aspects of design. Designers constantly strive to increase performance while maximizing efficiency. With respect to performance, greater overall microprocessor speed is achieved by improving the speed of various related and unrelated microprocessor circuits and operations. For example, operational efficiency is improved by providing parallel and out-of-order instruction execution. As another example, operational efficiency also is improved by providing faster and greater access to information, with such information including instructions and/or data. The present embodiments are primarily directed at this access capability and, more particularly, to improving and equalizing access by the same microprocessor to various types of information including instructions, non-pixel data, and pixel data.
One common approach in the field of modern high performance data processing systems is to implement the system using a single-chip microprocessor as the central processing unit (CPU), and using external semiconductor random-access memory (RAM) as main system memory. The main system memory is generally implemented with devices such as dynamic RAM (DRAM), which are of high density and low cost-per-bit; however, the latency and bandwidth of conventional DRAM are sometimes less than desirable or acceptable, and often cannot keep up with the clock rates of modern microprocessors. Thus, other memory considerations are also now involved when developing additional aspects of the system design, as better appreciated below.
Another very common approach in modern computer systems directed at improving access time to information is to include one or more levels of cache memory within the system, because access to a cache is substantially faster than access to data in main memory. Cache memories are typically relatively small blocks of high speed static RAM (SRAM), either on-chip with the microprocessor or off-chip (or both), for storing the contents of memory locations that are likely to be accessed in the near future. Typically, cache memory also stores the contents of memory locations that are near neighbors to a memory location that was recently accessed; because microprocessors often access memory in a sequential fashion, it is likely that successive memory accesses in successive cycles will access memory addresses that are very close to one another in the memory space. Accordingly, by storing the neighboring memory location contents in a cache, a good portion of the memory accesses may be made by the microprocessor to cache, rather than to main memory. The overall performance of the system is thus improved through the implementation of one or more cache memories. Most modern microprocessor systems include multiple levels of cache memory (either on or off-chip), with the capacity of the cache increasing (and its speed decreasing) with each successive level, to optimize performance. Typically, the lowest level cache (i.e., the first to be accessed) is smaller and faster than the cache or caches above it in the hierarchy, and the number of caches in a given memory hierarchy may vary. In any event, when an information address is issued in a system utilizing the cache hierarchy, the address is typically directed to the lowest level cache to see if that cache stores information corresponding to that address, that is, whether there is a “hit” in that cache.
If a hit occurs, then the addressed information is retrieved from the cache without having to access a memory higher in the memory hierarchy, where that higher ordered memory is likely slower to access than the hit cache memory. On the other hand, if a cache hit does not occur, then it is said that a cache “miss” occurs. In response, the next higher ordered memory structure is presented with the address at issue. This action may occur after, or at the same time as, the addressing of the lower level cache. If this next higher ordered memory structure is another cache, then once again a hit or miss may occur. If misses occur at each cache, then eventually the process reaches the highest ordered memory structure in the system, at which point the addressed information may be retrieved from that memory.
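By way of illustration only, the hit and miss sequence described above may be sketched in software as follows. The sketch is a simplified model under stated assumptions and does not represent any particular circuit: the names `CacheLevel` and `read` are hypothetical, the caches are presented with the address one after another (rather than concurrently), capacity and eviction are not modeled, and the policy of filling every cache that missed is an assumption made only so that a repeated access hits lower in the hierarchy.

```python
class CacheLevel:
    """One cache in the hierarchy, modeled as a mapping of address -> contents.

    Hypothetical illustration: capacity, line size, and eviction are not modeled.
    """

    def __init__(self, name):
        self.name = name
        self.lines = {}

    def lookup(self, address):
        # Returns (hit, value); a miss is reported as (False, None).
        if address in self.lines:
            return True, self.lines[address]
        return False, None


def read(address, levels, main_memory):
    """Present the address to each cache in order, lowest level first.

    Returns (value, source), where source names the structure that
    supplied the addressed information.
    """
    missed = []
    for level in levels:
        hit, value = level.lookup(address)
        if hit:
            source = level.name
            break
        missed.append(level)
    else:
        # A miss occurred at every cache: retrieve the information from
        # the highest ordered memory structure.
        value = main_memory[address]
        source = "main"

    # Assumed fill policy: copy the information into each cache that missed.
    for level in missed:
        level.lines[address] = value
    return value, source
```

In this model, a first read of an address is supplied by main memory, and an immediately repeated read of the same address is supplied by the lowest level cache.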
By way of further background, another manner of improving efficiencies with respect to modern computers is through the use of a so-called unified memory architecture (“UMA”). More particularly, one factor in the overall system costs includes the various types and number of memory structures, including the cache systems mentioned above. However, another consideration is the implementation of what is sometimes referred to as video memory or pixel memory, that is, the type of storage resource utilized for storing pixel data (i.e., data for driving some type of image display such as a cathode ray tube monitor or other type of display). Under a UMA system, the pixel data is mapped directly to, and stored in, the system main memory. This choice is an alternative to providing a separate pixel memory, which is typically external to the microprocessor and dedicated solely to inputting and outputting pixel data. Therefore, the UMA system eliminates the need for this additional memory structure, where that structure is dedicated solely to pixel data. This approach is typically perceived as favorable because, despite the potentially slower access to main memory, the cost of a larger main memory is typically considerably less than that of a separate pixel memory. Note, however, that the UMA system may be considered to have certain drawbacks in particular contexts. For example, because of its direct mapping of the pixel data, a fixed amount of the address space of the system main memory is unavailable for other types of data, because that address space is necessarily dedicated to pixel data. As another example, the system main memory typically is accessible only via a single bus and, therefore, only a single access at a time may be made, to one or the other of the pixel data or the non-pixel data stored in the memory structure.
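The first drawback noted above may be appreciated from a short worked calculation, sketched below. The function names, and any display dimensions, pixel depth, or memory size supplied to them, are hypothetical values chosen only for illustration and are not taken from any particular system.

```python
def frame_buffer_bytes(width, height, bytes_per_pixel):
    """Bytes of main memory address space dedicated to one frame of pixel data."""
    return width * height * bytes_per_pixel


def remaining_bytes(main_memory_bytes, width, height, bytes_per_pixel):
    """Main memory left for non-pixel information once the pixel region is reserved."""
    reserved = frame_buffer_bytes(width, height, bytes_per_pixel)
    if reserved > main_memory_bytes:
        raise ValueError("pixel data does not fit in main memory")
    return main_memory_bytes - reserved
```

For example, under the assumption of a 1024 by 768 display at 2 bytes per pixel, `frame_buffer_bytes(1024, 768, 2)` evaluates to 1,572,864 bytes, and that amount of address space is unavailable for other types of data for as long as the mapping is fixed.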
In view of the above, the present inventors have recognized various limitations of the above factors regarding memory systems. Thus, below are presented various inventive embodiments which permit improved efficiency in various contexts as measured against these prior art drawbacks as well as others which will be appreciated by one skilled in the art.
In one embodiment, there is a computer system comprising a central processing unit and a memory hierarchy. The memory hierarchy includes a first cache memory and a second cache memory. The first cache memory is operable to store non-pixel information, wherein the non-pixel information is accessible for processing by the central processing unit. The second cache memory is higher in the memory hierarchy than the first cache memory, and has a number of storage locations operable to store non-pixel information and pixel data. Lastly, the computer system comprises cache control circuitry for dynamically apportioning the number of storage locations of the second cache memory such that a first group of the storage locations are for storing non-pixel information and such that a second group of the storage locations are for storing pixel data. Other circuits, systems, and methods are also disclosed and claimed.
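For purposes of illustration only, the notion of apportioning the storage locations of the second cache memory into two groups may be modeled in software as sketched below. This section does not state how the cache control circuitry selects or changes the apportionment, so everything specific in the sketch is an assumption: the class name `ApportionedCache`, the use of a single movable boundary, and the placement of the pixel group at the upper end of the storage locations are all hypothetical.

```python
class ApportionedCache:
    """Hypothetical model of a cache whose storage locations are divided
    into a non-pixel group and a pixel group by a movable boundary."""

    def __init__(self, num_locations, pixel_locations=0):
        self.num_locations = num_locations
        self.apportion(pixel_locations)

    def apportion(self, pixel_locations):
        # Re-divide the fixed number of storage locations between the groups.
        if not 0 <= pixel_locations <= self.num_locations:
            raise ValueError("pixel group must fit within the cache")
        self.pixel_locations = pixel_locations

    @property
    def non_pixel_locations(self):
        return self.num_locations - self.pixel_locations

    def group_of(self, index):
        # Assumed layout: non-pixel locations first, pixel locations last.
        if not 0 <= index < self.num_locations:
            raise IndexError("no such storage location")
        return "pixel" if index >= self.non_pixel_locations else "non-pixel"
```

In this model, a call to `apportion` changes which group a given storage location belongs to without changing the total number of storage locations, which is the sense in which the apportionment is dynamic.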