1. Technical Field
The present invention generally relates to an improved data processing system and in particular to improved memory management in a data processing system. Still more particularly, the present invention relates to improved cache memory management in a data processing system, which includes an improved system bus response protocol.
2. Description of the Related Art
Most data processing systems are controlled by one or more processors and employ various levels of memory. Typically, programs and data are loaded into a data processing system's memory storage areas for execution or reference by the processor, and are stored in different portions of the memory storage depending on the processor's current need for such programs or data. A running program or data referenced by a running program must be within the system's main memory (primary or main storage, which is typically random access memory). Programs or data which are not needed immediately may be kept in secondary memory (secondary storage, such as a tape or disk drive) until needed, and then brought into main storage for execution or reference. Secondary storage media are generally less costly than random access memory components and have much greater capacity, while main memory storage may generally be accessed much faster than secondary memory.
Within the system storage hierarchy, one or more levels of high-speed cache memory may be employed between the processor and main memory to improve performance and utilization. Cache storage is much faster than the main memory, but is also relatively expensive as compared to main memory and is therefore typically employed only in relatively small amounts within a data processing system. In addition, limiting the size of cache storage enhances the speed of the cache. Various levels of cache memory are often employed, with trade-offs between size and access time being made at levels logically further from the processor(s). Cache memory generally operates faster than main memory, typically by a factor of five to ten, and may, under certain circumstances, approach the processor operational speed. If program instructions and/or data which are required during execution are pre-loaded in high speed cache memory, average overall memory access time for the system will approach the access time of the cache.
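The relationship between hit rate and average access time can be illustrated with a simple two-level model. The sketch below is illustrative only: the function name, the hit-rate parameter, and the assumption that a miss costs exactly one main memory access are simplifications introduced here and are not part of the described system.

```python
def average_access_time(hit_rate, cache_time, main_time):
    """Average access time for a simplified two-level hierarchy.

    Assumption (not from the source text): every access is served either by
    the cache (with probability hit_rate) or by main memory (otherwise).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_time + (1.0 - hit_rate) * main_time


if __name__ == "__main__":
    # Main memory assumed ten times slower than the cache, per the
    # "five to ten" range mentioned above; time units are arbitrary.
    for rate in (0.50, 0.90, 0.99):
        print(rate, average_access_time(rate, cache_time=1.0, main_time=10.0))
```

Under this model, as the hit rate approaches one the average access time approaches the cache access time, which matches the behavior described above.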
In order to enhance performance, contemporary data processing systems often utilize multiple processors which concurrently execute portions of a given task. To further enhance performance, such multiple processor or multi-processor (MP) data processing systems often utilize a multi-level cache/memory hierarchy to reduce the access time required to retrieve data from memory. A multi-processor system may include a number of processors each with an associated on-chip, level-one (L1) cache, a number of level-two (L2) caches, and a number of system memory modules. Typically, the cache/memory hierarchy is arranged such that each L2 cache is accessed by a subset of the L1 caches within the system via a local bus. In turn, each L2 cache and system memory module is coupled to a system bus (or interconnect switch) such that an L2 cache within the multi-processor system may access data from any of the system memory modules coupled to the bus.
The use of cache memory imposes one more level of memory management overhead on the data processing system. Logic must be implemented to control allocation, deallocation, and coherency management of cache content. When space is required, instructions or data previously residing in the cache must be "swapped" out, usually on a "least-recently-used" (LRU) basis. Accordingly, if there is no room in the cache for additional instructions or data, then the information which has not been accessed for the longest period of time will be swapped out of the cache and replaced with the new information. In this manner, the most recently used information, which has the greatest likelihood of being again required, is available in the cache at any given time.
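The least-recently-used replacement described above can be sketched as follows. This is a minimal software model, assuming a fully associative cache keyed by address; the class and method names are hypothetical, and actual cache hardware typically applies LRU per congruence class rather than across the whole cache.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with least-recently-used replacement."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.lines = OrderedDict()  # least recently used first, most recent last

    def access(self, address, data=None):
        """Access a line; return (hit, evicted_address).

        evicted_address is None when no line had to be swapped out.
        """
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return True, None
        evicted = None
        if len(self.lines) >= self.capacity:
            # No room: swap out the line not accessed for the longest time.
            evicted, _ = self.lines.popitem(last=False)
        self.lines[address] = data
        return False, evicted


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    for addr in ("A", "B", "A", "C"):
        print(addr, cache.access(addr))
```

In the usage loop, the second access to "A" refreshes its position, so the later miss on "C" swaps out "B" rather than "A".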
As noted, previous cache management techniques mostly depend on least-recently-used (LRU) algorithms in selecting a cache line victim for eviction and replacement. However, empirical measurements have shown that strict least-recently-used algorithms are unsatisfactory in many cases.
Various enhancements to LRU algorithms have been proposed or implemented in recent years, such as software managed LRU, pseudo-random influences, etc. Basic symmetric multi-processor snooping protocols have also been utilized to influence cache management.
Even with a cache memory management scheme, there are additional, related problems that can cause system performance to suffer. For example, in data processing systems with several levels of cache/memory storage, a great deal of shuttling of instructions and data between the various cache/memory levels occurs, which consumes system resources such as processor cycles and bus bandwidth which might otherwise be put to more productive processing use. The problem has been exacerbated in recent years by the growing disparity between processor speeds and the operational speeds of the different system components used to transfer information and instructions to the processor.
Additionally, only a limited amount of access history information is shared between horizontal processors/caches within conventional multiprocessor systems. Prior art system bus communication of access history is generally limited to coherency information (e.g., indicating in a snoop response that the snooping cache has the line in a shared or modified coherency state).
It would be desirable, therefore, to provide a system that increases the "intelligence" of cache management, in particular by passing dynamic application sequence behavior information between processors and caches and utilizing that information to optimize cache management.
It is therefore one object of the present invention to provide an improved data processing system.
It is another object of the present invention to provide improved memory management in a data processing system.
It is yet another object of the present invention to provide improved cache memory management in a multiprocessor data processing system, which includes an improved system bus response protocol.
The foregoing objects are achieved as is now described. In a multiprocessor system in which dynamic application sequence behavior information is maintained within cache directories, system bus snoopers append the dynamic application sequence behavior information for the target cache line to their snoop responses. The system controller, which may also maintain dynamic application sequence behavior information in a history directory, employs the available dynamic application sequence behavior information to append "hints" to the combined response, appends the concatenated dynamic application sequence behavior information to the combined response, or both. Either the hints or the dynamic application sequence behavior information may be employed by the bus master and other snoopers in cache management.
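The response flow summarized above may be sketched in software as follows. This is a hedged illustration only: the data structures, the encoding of history as a simple access count, the priority ordering of coherency states, the "likely-reuse" hint, and the threshold are all hypothetical choices made here for the example and are not taken from the described protocol.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed priority used to merge coherency states (illustrative only).
PRIORITY = {"invalid": 0, "shared": 1, "modified": 2}


@dataclass
class SnoopResponse:
    snooper_id: str
    coherency_state: str
    history: Optional[int] = None  # hypothetical encoding: recent access count


@dataclass
class CombinedResponse:
    coherency_state: str
    hints: List[str] = field(default_factory=list)
    history: List[int] = field(default_factory=list)


def combine(responses, controller_history=None,
            add_hints=True, add_history=True, reuse_threshold=2):
    """Model of a system controller forming a combined response."""
    state = max((r.coherency_state for r in responses),
                key=PRIORITY.__getitem__, default="invalid")
    # Gather history appended by snoopers, plus the controller's own, if any.
    collected = [r.history for r in responses if r.history is not None]
    if controller_history is not None:
        collected.append(controller_history)
    combined = CombinedResponse(coherency_state=state)
    if add_history:
        combined.history = list(collected)  # concatenated history
    if add_hints and sum(collected) >= reuse_threshold:
        combined.hints.append("likely-reuse")  # hypothetical hint
    return combined


if __name__ == "__main__":
    snoops = [SnoopResponse("L2-0", "shared", 1),
              SnoopResponse("L2-1", "invalid"),
              SnoopResponse("L2-2", "modified", 2)]
    print(combine(snoops, controller_history=1))
```

The `add_hints` and `add_history` flags mirror the three options named above: hints only, concatenated history only, or both. A bus master or snooper receiving the combined response could then consult either field when making cache management decisions.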
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.