In today's computing environments, data storage systems that include large arrays of disk storage can be connected to one or more host processing systems. These data storage systems may store very large amounts of data, for example terabytes. It is important that this data not be compromised and that it remain accessible to applications running on the host processing systems: if an event prevents a host system from accessing the data on the storage system, the applications running on the host system can cease to function. This can have devastating consequences where the application is, for instance, a database application supporting a financial institution. Various mechanisms are therefore employed in data storage systems to prevent loss or inaccessibility of data.
Many storage systems include caches, usually consisting of volatile RAM, for temporarily storing blocks of disk data. The data stored in the cache is more quickly accessible by the host systems, thereby providing significant performance gains. In a particular caching implementation known as “write-back” caching, when the host system writes data to the storage system, the storage system writes the data to the cache but does not immediately write the data to the disk array. The data remains accessible in the cache and is not written back to the disk array until the cache location is needed for storage of a different block of disk data. Performance advantages are achieved by reducing the number of relatively time-consuming accesses to the disk array.
Such write-back caching systems are vulnerable in a failure situation, since data that has been written to the cache may not yet be stored on the disk array. If the cache system fails, the cached data may be lost. Upon recovery, the contents of the disk array will not reflect the most recent application updates. This can cause an application to malfunction or fail. Mechanisms have been developed to avoid this problem. For example, many storage systems contain uninterruptible power supplies or batteries that allow these systems to continue to operate for a limited period of time after a line power failure. Upon notification of an impending failure, dirty data resident in the cache (that is, data that has been written to the cache but not yet to the disk array) is “flushed” to the disk array: all dirty data is written back to the disk array. During the time that dirty data is being flushed to the disk array, all I/O activity to and from the host system is halted so that the flush can be completed.
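The conventional flush described above, including the halting of host I/O, might be sketched as follows. This is a hypothetical Python illustration rather than an actual controller implementation: the names `FlushingCache`, `HostIOHalted`, `begin_flush`, and `flush_step` are invented here, and the flush is broken into steps only so that the window in which host I/O is refused is visible. What happens after the flush completes is not modelled.

```python
class HostIOHalted(Exception):
    """Raised when the host issues I/O while a flush is under way (hypothetical name)."""


class FlushingCache:
    """Illustrative cache that halts host I/O while dirty data is flushed."""

    def __init__(self, disk):
        self.disk = disk        # dict: block number -> data, stands in for the disk array
        self.cache = {}         # block number -> cached data
        self.dirty = set()      # blocks written to the cache but not yet to disk
        self.io_halted = False

    def host_write(self, block, data):
        if self.io_halted:
            raise HostIOHalted("write refused during flush")
        self.cache[block] = data
        self.dirty.add(block)

    def host_read(self, block):
        if self.io_halted:
            raise HostIOHalted("read refused during flush")
        return self.cache.get(block, self.disk.get(block))

    def begin_flush(self):
        # Called on notification of an impending failure, e.g. loss of line power.
        self.io_halted = True

    def flush_step(self):
        # Write one dirty block back to disk; return True while dirty blocks remain.
        if not self.dirty:
            return False
        block = min(self.dirty)
        self.disk[block] = self.cache[block]
        self.dirty.remove(block)
        return bool(self.dirty)


store = FlushingCache(disk={})
store.host_write(1, b"a")
store.host_write(2, b"b")
store.begin_flush()
while store.flush_step():
    pass  # any host_read or host_write issued here would raise HostIOHalted
```

From the host's point of view, every request issued after `begin_flush` fails, whether or not line power returns before the flush finishes.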
The halting of I/O activity to and from the host system is disadvantageous in that it can cause applications executing on the host systems to fail. This is particularly deleterious in situations where the line power recovers before back-up power runs out. It is very often the case that line power is lost for only a few seconds. Thus, the currently known cache flushing mechanisms can needlessly cause applications to fail. It would be advantageous to provide a cache flushing mechanism that avoids these shortcomings.