1. Field of the Invention
The invention relates to the field of data processing. More specifically, the invention relates to improving the access of high speed data storage devices, such as cache memories, in data processing systems.
2. Background Information
A cache, which is a relatively small, yet fast storage device, is typically utilized in data processing systems to store a limited quantity of data (e.g., instructions, data operands, etc.) that has recently been used and/or is likely to be used by a processor or other device that may access the cache. As such, a cache may greatly reduce the latency associated with accessing higher levels of memory (e.g., main memory, hard disk, etc.). Each item of data that is stored in a data array of the cache typically has an associated "tag" value that is stored in a tag array. In several implementations, a memory address, or a portion thereof, is identified by a unique tag. Thus, when a read of a memory address, for example, is requested by a device (e.g., a processor, I/O bridge, other bus master, etc.), the memory address or a portion thereof is compared against one or more tags in the tag array of the cache to determine if the data corresponding to the memory address is stored in the data array of the cache.
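By way of illustration only, the tag comparison described above may be sketched as follows. The sketch assumes a direct-mapped organization; the line size, the number of lines, and the names split_address, DirectMappedCache, fill, and lookup are hypothetical and are not taken from any particular implementation.

```python
# Illustrative direct-mapped cache. All sizes and names are assumptions.
LINE_BYTES = 64   # bytes per cache line (assumed)
NUM_LINES = 256   # number of lines in the data array (assumed)


def split_address(addr):
    """Split a memory address into (tag, index, offset)."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % NUM_LINES
    tag = addr // (LINE_BYTES * NUM_LINES)
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        # One tag per line of the data array.
        self.tag_array = [None] * NUM_LINES
        self.data_array = [None] * NUM_LINES

    def fill(self, addr, line):
        """Store a line and record its tag."""
        tag, index, _ = split_address(addr)
        self.tag_array[index] = tag
        self.data_array[index] = line

    def lookup(self, addr):
        """Compare the address tag against the stored tag.

        Returns (hit, line); line is None on a miss.
        """
        tag, index, _ = split_address(addr)
        if self.tag_array[index] == tag:
            return True, self.data_array[index]
        return False, None
```

In this sketch, two addresses that map to the same index but carry different tags cannot both be resident; a lookup of the second address after a fill of the first reports a miss.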
Data in a cache may not always be consistent with data in another storage area (e.g., main memory, higher level cache, etc.). For example, a processor may copy requested data from main memory into a cache and modify the data in the cache (or cached data). Until main memory is updated with the modified cached data, main memory will contain "stale" data that is inconsistent with the modified data in the cache. In systems where more than one device may share storage devices (e.g., multi-processing systems having caches and shared-memory), cache/data coherency becomes an important consideration, since more than one device may have access to a shared memory. Thus, various techniques have been utilized to provide coherency between various copies of data that may be present in various storage devices, including caches and other storage devices, that may be shared or accessible by a particular device or set of devices.
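The stale-data condition described above may be illustrated by the following minimal sketch, which assumes a write-back policy in which modified lines are copied to main memory only upon an explicit write-back. The class WriteBackCache and its methods are hypothetical, and main memory is modeled as a simple dictionary.

```python
class WriteBackCache:
    """Toy write-back cache over a dictionary standing in for main memory."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory (assumed: addr -> value)
        self.lines = {}        # cached copies
        self.dirty = set()     # addresses modified in the cache only

    def read(self, addr):
        # On a miss, copy the requested data from main memory.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Modify only the cached copy; main memory is now stale.
        self.lines[addr] = value
        self.dirty.add(addr)

    def write_back(self):
        # Update main memory with every modified line.
        for addr in self.dirty:
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()
```

Between the call to write and the call to write_back, any other device reading the dictionary directly would observe the stale value, which is the inconsistency that coherency techniques are intended to prevent.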
FIG. 1A is a block diagram illustrating an exemplary prior art computer system employing cache memories and shared-memory devices. In FIG. 1A, a system 100 is shown which includes a system bus (or "frontside bus") 110 connecting a processor 104, a memory 112, and a processor 114. The memory 112 represents a relatively slow, high level memory (e.g., main memory, hard disk, etc.) that is shared by the processor 104 and the processor 114.
The processor 104 includes an "on-chip" L1 cache 102, and is further connected, via a dedicated or "backside" bus 106, to an L2 cache 108. In one implementation, the L1 cache 102 is smaller, yet faster than the L2 cache 108. Thus, the L1 cache 102 may further cache data from the L2 cache 108, which in turn may cache data from the memory 112. Similarly, the processor 114 is shown having an L1 cache 120, and is further connected, via a backside bus 130, to an L2 cache 122. As shown, the L2 cache 108 includes a tag array 116 and a data array 118, and similarly, the L2 cache 122 includes a tag array 124 and a data array 126. The tag arrays 116 and 124 may store a number of tags, each corresponding to cached data stored in a location in the data arrays 118 and 126, respectively.
Upon request of data (e.g., a read request) by the processor 104, for example, the L1 cache 102 may be accessed. If an L1 cache miss occurs (i.e., the requested data is not available in the L1 cache 102), the L2 cache 108 may then be accessed via the backside bus 106 to determine if the requested data is contained therein. Additionally, data in the L1 cache 102 or the L2 cache 108 may be modified by the processor 104. In a similar manner, the processor 114 may operate in conjunction with its L1 cache 120 and L2 cache 122.
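The read sequence described above may be sketched as follows. The sketch models each cache and the memory as a dictionary keyed by address, and it assumes that a miss at one level fills that level from the next; the fill policy and the function name read are assumptions made for illustration only.

```python
def read(addr, l1, l2, memory):
    """Return (value, source) for a read request.

    l1, l2, and memory are dictionaries keyed by address (assumed model).
    """
    # Access the L1 cache first.
    if addr in l1:
        return l1[addr], "L1"
    # L1 miss: access the L2 cache (over the backside bus in FIG. 1A).
    if addr in l2:
        l1[addr] = l2[addr]      # assumed: fill L1 from L2
        return l1[addr], "L2"
    # L2 miss: fetch from the shared memory and fill both caches (assumed).
    value = memory[addr]
    l2[addr] = value
    l1[addr] = value
    return value, "memory"
```

A second read of the same address is then satisfied by the L1 cache without any access to the L2 cache or the memory.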
Additionally, the L1 cache 102 may monitor or "snoop" the system bus 110 to determine if data being requested or modified by a transaction on the system bus 110 (e.g., by the processor 114 or other device connected to the system bus 110) is stored in the L1 cache 102. Similarly, the L2 cache 108 may snoop, through the backside bus 106 and the processor 104, the system bus 110. For example, the processor 104 may include logic to control snoop operations by the L2 cache 108.
From the above description, it is apparent that the processor 114 or other requesting agent must monitor the system bus 110 to receive a snoop result from the L1 cache 102 and the L2 cache 108 before completing a read and/or write request of the shared memory 112. However, a number of circumstances may delay the completion of a snoop operation of the L1 cache 102 or the L2 cache 108. For example, the backside bus 106 may be occupied with a transaction between the processor 104 and the L2 cache 108, which may delay the snoop of the L2 cache 108. Furthermore, a relatively substantial delay may be incurred while awaiting snoop results of the L2 cache 108 through the processor 104 and the backside bus 106. Accordingly, the overall delay associated with obtaining snoop results first from the L1 cache 102 and then from the L2 cache 108 through the processor 104 and the backside bus 106 may be relatively substantial.
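The accumulation of delay described above may be approximated by a simple additive model. The function below is a hypothetical sketch; the cycle counts are placeholder values supplied by the caller and do not represent measurements of any actual system.

```python
def serial_snoop_latency(l1_cycles, l2_cycles, backside_busy_cycles=0):
    """Cycles until both snoop results are available in the FIG. 1A scheme.

    Assumed model: the L1 snoop completes first, the L2 snoop then waits
    for any transaction occupying the backside bus, and only then proceeds
    through the processor and the backside bus.
    """
    return l1_cycles + backside_busy_cycles + l2_cycles
```

Under this model, each additional cycle for which the backside bus is occupied is added directly to the time the requesting agent must wait for the combined snoop result.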
FIG. 1B is a block diagram illustrating an alternative implementation of the exemplary prior art computer system employing cache memories and shared-memory devices described with reference to FIG. 1A. In the system 150 shown in FIG. 1B, the L2 caches 108 and 122 are connected to the system bus 110, while the processors 104 and 114 are connected, via the backside bus 106 and the backside bus 130, respectively, to the L2 caches 108 and 122, respectively.
As previously described with reference to the system 100 of FIG. 1A, the backside bus 106 may be occupied with a transaction between the processor 104 and the L2 cache 108, which transaction could delay the snoop of the L1 cache 102 through the backside bus 106 and the L2 cache 108. Furthermore, the L1 cache 102 can perform a snoop and/or post snoop results on the system bus 110 only "through" the L2 cache 108, and only when the L2 cache 108 is not performing the same.
Thus, it is desirable to provide cache/data coherency in a system that may include multiple caches and requesting devices, while avoiding the above-described delays associated with prior art snooping schemes.
According to one aspect of the invention, a first device is coupled to a first bus and a second bus. Additionally, a tag array is coupled to the first bus and further coupled to the first device via the second bus.
According to yet another aspect of the invention, a method is provided for allowing access by a first storage area of a first device in response to activity on a first bus. Further, in response to activity on the first bus, a method is provided for allowing access by a second storage area of the first device concurrently with the access by the first storage area, wherein the second storage area is coupled to the first bus and is further coupled to the first device via a second bus.
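The effect of accessing the two storage areas concurrently may be illustrated by a toy timing model. This is a sketch under stated assumptions and not the claimed implementation; the function name and the cycle counts are hypothetical.

```python
def concurrent_access_latency(first_area_cycles, second_area_cycles):
    """Cycles until both storage areas have responded to bus activity.

    Assumed model: both accesses begin at the same time, so the combined
    result is available when the slower of the two completes.
    """
    return max(first_area_cycles, second_area_cycles)
```

For example, with assumed access times of 2 and 6 cycles, this model yields 6 cycles, whereas accessing the same two storage areas one after the other would take 2 + 6 = 8 cycles.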