1. Field of the Invention
The present invention is directed generally to memory devices and, more particularly, to memory devices of the type used to provide cache memory in computer systems.
2. Description of the Background
In developing a computer system, it is known that modern processors are capable of operating at speeds which exceed the ability of memory devices to provide data and/or instructions to the processor. To overcome such problems, and allow the processor to run at or near its clock speed, it is well known to provide cache memory. Cache memory is a small amount of memory, compared to the overall memory capacity of the system, which maintains the data or instructions which the computer is most likely to need. The cache memory is typically constructed of a faster, more expensive type of memory cell than is used for the bulk of the system memory requirements. Thus, there is a tradeoff between speed and cost. The more cache memory provided, the faster the system will be, and also the more costly.
A cache memory system is comprised of two memory components. A first memory component, which is the larger of the two components, is a memory which stores the data or instructions needed by the processor so that such information need not be retrieved from the main memory. A second memory component is referred to as the tag cache. The tag cache contains an address, or a portion of an address. When that address or portion of the address matches the address of data or instructions which the processor has requested, a "hit" is generated indicating that the information requested by the processor is in the cache memory system. Complicated systems have been developed to ensure the maximum number of hits.
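The tag comparison described above can be illustrated with a minimal sketch. It assumes a direct-mapped organization, a 32-byte line, and 8K tag entries (which together give a 256K byte cache); none of these choices, nor the names `TagCache`, `split`, `lookup`, and `fill`, come from the text, and real tag RAMs also hold status bits that are omitted here.

```python
# Illustrative model of a tag cache lookup (assumptions: direct-mapped,
# 32-byte lines, 8K tag entries; all names are hypothetical).

LINE_BYTES = 32                 # assumed cache line size in bytes
NUM_LINES = 8 * 1024            # assumed tag depth (number of tag entries)

OFFSET_BITS = LINE_BYTES.bit_length() - 1   # low address bits selecting a byte within a line
INDEX_BITS = NUM_LINES.bit_length() - 1     # next address bits selecting a tag entry


class TagCache:
    """Stores the upper portion of the address for each cached line."""

    def __init__(self):
        # None marks an entry that holds no valid tag yet.
        self.tags = [None] * NUM_LINES

    def split(self, address):
        """Split an address into (index into the tag RAM, tag to compare)."""
        index = (address >> OFFSET_BITS) & (NUM_LINES - 1)
        tag = address >> (OFFSET_BITS + INDEX_BITS)
        return index, tag

    def lookup(self, address):
        """Return True (a "hit") when the stored tag matches the requested address."""
        index, tag = self.split(address)
        return self.tags[index] == tag

    def fill(self, address):
        """Record that the line containing this address is now held in the data RAM."""
        index, tag = self.split(address)
        self.tags[index] = tag
```

In this sketch, a lookup of an address misses until `fill` has been called for its line. After that, any address within the same 32-byte line hits, while an address exactly 256K bytes away selects the same tag entry but carries a different tag and therefore misses.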
A typical cache memory is implemented using a plurality of data RAMs and one or more tag RAMs. Generally, the depth of the tag RAM is one quarter of the depth of the data RAM. In most cache implementations since late 1994, the individual data RAMs have been wider devices than the tag RAMs. That results in a large difference in device sizes between the tag RAM and the data RAM. For example, a 256K byte cache for an Intel Pentium® processor requires two 32K×32 SRAMs and one tag RAM which can range in size from 8K×8 to 8K×16 depending upon the chosen implementation, i.e., the maximum amount of main memory to be supported in the system. The result of that example is that each data RAM has a density of one M bit whereas the tag RAM has a density which ranges from 64K bits to 128K bits. The ratio of the density of a cache data RAM to the density of the cache tag RAM is therefore 16 or 8, respectively.
At any one time, there exists an optimum memory density with regard to manufacturing costs. That density becomes the highest volume available device at that time. When the one M bit RAM is the most prevalent device for use as cache data RAMs, it will be because it is the most cost effective size of device to manufacture. However, the 64K bit or 128K bit tag will still be needed. Manufacturing a separate 64K bit or 128K bit tag memory will clearly be an inefficient use of process equipment if the one M bit RAM is the most cost effective device to manufacture.
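The figures in the foregoing example can be checked with a short calculation. The sketch below takes the raw capacity of a device to be its depth multiplied by its width, which yields a count of bits; the helper name `capacity_bits` and the variable names are illustrative only.

```python
# Arithmetic check of the 256K byte cache example (names are hypothetical).

K = 1024


def capacity_bits(depth, width):
    """Raw capacity of a depth x width memory device, in bits."""
    return depth * width


data_ram_bits = capacity_bits(32 * K, 32)    # one 32K x 32 data SRAM
tag_narrow_bits = capacity_bits(8 * K, 8)    # 8K x 8 tag RAM
tag_wide_bits = capacity_bits(8 * K, 16)     # 8K x 16 tag RAM

# Two data SRAMs together hold the cached data; divide by 8 to get bytes.
cache_bytes = 2 * data_ram_bits // 8

# Tag depth relative to data RAM depth, and the density ratios of the devices.
depth_ratio = (8 * K) / (32 * K)
ratio_narrow = data_ram_bits // tag_narrow_bits
ratio_wide = data_ram_bits // tag_wide_bits

print(data_ram_bits, tag_narrow_bits, tag_wide_bits)  # 1048576 65536 131072
print(cache_bytes)                                    # 262144
print(depth_ratio, ratio_narrow, ratio_wide)          # 0.25 16 8
```

The two data SRAMs together supply 262,144 bytes, i.e., the 256K byte cache, and a single data SRAM is 16 or 8 times as dense as the tag RAM, depending upon the tag width chosen.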
Aspects of this problem have been addressed by integrating the entire tag memory and the entire data RAM into one memory chip. That results in an extremely wide memory device which is more costly to test. It also results in a device which has too large a density for the optimal manufacturing cost because the highest-volume cache size is determined by the most economic basic device. Thus, if two data RAMs and one tag RAM are combined into one device, it will be more than twice the density of the economic baseline solution. Device yield deteriorates rapidly as die size is increased beyond that "most economic" size.
Another way to address the problem is to incorporate the entire tag into one data RAM. That too is less efficient. It makes that RAM excessively wide, which increases test cost and die area. It also requires that the tag be wasted in a system which needs multiple such devices in parallel, because only one of the tags is used. The latter problem can be solved by using one device which has the tag incorporated in parallel with another device which does not have the tag incorporated. However, that then results in two different silicon implementations and the accompanying overhead costs of design, qualification, etc. Thus, the need exists for a cost effective approach to providing tag cache devices which makes maximum use of currently available fabrication processes while at the same time eliminating the need to design, verify, and qualify a separate device.