It is well known to provide a cache external to the processor unit of a computer in order to improve system performance by reducing the effective access time to data stored on mass-storage peripherals (typically hard disks).
The basic idea of a cache is to store part of the data of a mass-storage device in a much faster memory (cache memory). If the needed data is already in the cache (cache `hit`), the access time is reduced, resulting in a boost to system performance. If the data is not in the cache memory (cache `miss`), there is no improvement for that access, but the cache memory is updated with the new data so that the next time the data is requested a cache `hit` will result.
By way of example, consider a computer having a 512 megabyte mass storage device with an access time of 10 ms for one data block (1K bytes), and a 4 megabyte cache memory with an access time of 10 microseconds for one data block. If the requested data is always in the cache memory (cache hit rate=100%), access performance will be boosted by a factor of 1000 by the cache memory. More realistically, the cache hit rate will be around 50%, in which case the average access time is roughly halved and performance will be boosted by a factor of about 2.
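The arithmetic behind these two figures can be checked with a short calculation. The following is only a sketch under simplifying assumptions that are not stated in the text: a miss is taken to cost exactly one mass-storage access, and the time spent looking up and updating the cache is ignored. The function names are hypothetical.

```python
def effective_access_time(hit_rate, t_cache, t_disk):
    """Average time per block access, in seconds.

    Assumption: a hit costs t_cache, a miss costs t_disk, and
    cache-management overhead is ignored.
    """
    return hit_rate * t_cache + (1.0 - hit_rate) * t_disk


def speedup(hit_rate, t_cache=10e-6, t_disk=10e-3):
    """Ratio of the uncached access time to the average cached access time."""
    return t_disk / effective_access_time(hit_rate, t_cache, t_disk)


if __name__ == "__main__":
    # The two hit rates used in the example: 100% and 50%.
    for rate in (1.0, 0.5):
        print(f"hit rate {rate:.0%}: speedup {speedup(rate):.2f}x")
```

With the example's values, a 100% hit rate gives a factor of 1000, and a 50% hit rate gives a factor just under 2, because the average access time is then dominated by the misses.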
Whilst this example shows clearly the benefit of a cache sub-system, the degree of this benefit is very difficult to estimate as it depends on a number of parameters:
1- operating system,
2- software applications,
3- file allocation on the mass-storage disk,
4- size and architecture of the cache,
5- bandwidth of the different buses,
6- overhead due to cache management.
In general terms, parameters 1 to 4 determine the cache hit rate whereas parameters 5 and 6 determine the cache efficiency at a given hit rate.
Other types of caching are also used to improve system performance such as internal processor caches; however, the present invention is concerned with cache sub-systems external to the processor unit. Two known forms of external cache system will now be briefly described with reference to FIGS. 1 and 2 of the accompanying drawings.
FIG. 1 illustrates a known computer system comprising a system processor 10 with working or system memory 11 from which the processor 10 executes programs including its operating system 18. The processor 10 is connected to the system memory 11 over a processor local bus 12 and a memory bus 13, these buses being interconnected through an inter-bus interface 14. A peripheral bus 15 is also connected to the interface 14 in a manner enabling its communication with both other buses. A mass storage peripheral device 16 is connected to the peripheral bus 15 via a peripheral controller 17.
In the FIG. 1 system, a software-based cache sub-system is provided. More particularly, a portion 19 of the system memory 11 is used as cache memory, and cache manager software 20 permanently loaded into system memory is used to control and manage the cache memory. The size of the cache memory 19 can be changed without any hardware modification and, indeed, cache size can be changed dynamically.
The cache manager software is called by the operating system 18 whenever the latter wishes to read or write a block of data, and it is the processor 10 that actually executes the cache manager software 20. The basic operation of the FIG. 1 cache sub-system is as follows:
a) Read access:
The operating system 18 sends a request to the cache manager software 20 indicating that it wishes to read one block of data. The cache manager 20 then checks whether the requested block is in the cache memory 19:
if no (read miss), the request is propagated to the peripheral controller 17, which returns the requested block from the mass storage peripheral 16 to the cache memory. The cache manager 20 then copies the data block to a location in system memory specified by the operating system.
if yes (read hit), the request is not transmitted to the peripheral controller 17; instead the cache manager 20 copies the data block from cache memory 19 to the specified location in system memory.
b) Write access:
The operating system 18 sends a request to the cache manager 20 indicating that it wishes to write one block of data. The cache manager 20 then checks whether the requested block is in the cache memory 19:
if no (write miss), the request is propagated to the peripheral controller 17 and the data block is thereafter transferred to the mass storage peripheral 16.
if yes (write hit), the request is not transmitted to the peripheral controller 17; instead the data block in the cache memory 19 is updated.
Thus, data flow is either from system memory to system memory in the case of a cache hit, or between the peripheral and system memory in the case of a cache miss. As the system memory bus often has the highest bandwidth, cache hit transfers have the best performance. Note that if the peripheral controller 17 has a DMA (master) capability, then the transfer between the peripheral and system memory can be done by the controller, thereby relieving the processor of this task; however, the processor still performs the cache management.
Advantages of the FIG. 1 caching arrangement are that it has no hardware impact on the system, it is a cost-free solution (no added hardware), the cache size is flexible, and the software cache manager is generally provided with the operating system. Disadvantages include the fact that using software to carry out cache management has a performance impact; indeed, some cache architectures that provide the best hit rate (for example, a fully associative cache with a great number of cache entries) may introduce unacceptable management overhead.
FIG. 2 shows another known cache arrangement as applied to the same computer system as FIG. 1. In the FIG. 2 arrangement, the cache management is not performed by the system processor 10 but by a dedicated hardware cache controller 26 that is part of the peripheral controller 17. The cache memory 25 is also dedicated to cache operation and is physically separate from the system memory 11. The basic operation of the FIG. 2 cache arrangement is as follows:
a) Read access:
The operating system 18 sends a request to the peripheral controller 17 indicating that it wishes to read one block of data. The cache controller 26 checks whether the block is in the cache memory 25:
if no (read miss), the requested data block is transferred from the mass storage peripheral 16 to the cache memory 25 and to the system memory 11.
if yes (read hit), the cache controller 26 transmits the data block from cache memory 25 to the system memory 11.
b) Write access:
The operating system 18 sends a request to the peripheral controller 17 indicating that it wishes to write one block of data. The cache controller 26 checks whether the block is in the cache memory 25:
if no (write miss), the data block is transmitted to the peripheral 16 from the system memory 11.
if yes (write hit), the data block in the cache memory is updated from system memory 11.
Thus, in all cases, the data flow goes through the peripheral bus and even if this bus has a high bandwidth (like the PCI bus) it may become a bottleneck because it is often shared with other peripherals (for example, a typical system has VIDEO, IDE, SCSI and LAN interfaces on the same bus).
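The hit and miss handling of the FIG. 2 arrangement can be modelled in a few lines. This is a minimal sketch only, not the controller's actual design: the class and field names are hypothetical, the cache is unbounded (no replacement policy is modelled), and each request is counted as one block crossing the peripheral bus. The description does not say when a block updated on a write hit reaches the mass storage peripheral, so the sketch leaves the peripheral copy untouched in that case.

```python
from dataclasses import dataclass, field


@dataclass
class BlockCache:
    """Toy model of the FIG. 2 read/write decisions (hypothetical names)."""
    disk: dict                                  # block number -> data; stands in for peripheral 16
    cache: dict = field(default_factory=dict)   # stands in for cache memory 25
    bus_transfers: int = 0                      # blocks moved over the peripheral bus

    def read(self, block: int) -> bytes:
        if block in self.cache:
            # Read hit: the block is supplied from cache memory.
            data = self.cache[block]
        else:
            # Read miss: the block comes from the peripheral and is also cached.
            data = self.disk[block]
            self.cache[block] = data
        # Hit or miss, the block reaches system memory over the peripheral bus.
        self.bus_transfers += 1
        return data

    def write(self, block: int, data: bytes) -> None:
        if block in self.cache:
            # Write hit: only the cached copy is updated in this sketch.
            self.cache[block] = data
        else:
            # Write miss: the block goes to the peripheral; nothing is cached.
            self.disk[block] = data
        # The block leaves system memory over the peripheral bus either way.
        self.bus_transfers += 1
```

For example, reading block 1 twice from `BlockCache(disk={1: b"a"})` produces one miss and one hit, yet `bus_transfers` ends at 2. A hit shortens the access time, but it does not remove traffic from the shared peripheral bus.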
Advantages of the FIG. 2 cache arrangement are that there is no system software overhead (the cache management is performed by dedicated hardware); cache efficiency is high (unlike the FIG. 1 arrangement, the cache organisation is not limited); and there is no need for drivers, making the arrangement operating-system independent. Disadvantages include the fact that the cache size is not flexible, the cost is high (dedicated DRAM and DRAM controller), and all cache transfers use the peripheral bus, which may be an issue for the other devices that share the same bus.
It is an object of the present invention to provide an external cache arrangement which minimises the disadvantages of the prior art systems and provides a reasonable compromise between low cost, efficiency and flexibility.