1. Field of the Invention
The invention relates to the field of flash cache architecture for storage systems, and specifically to a method for managing cache memory.
2. Description of the Related Art
Solid-state devices and disks (also known as SSDs) provide random Input/Output (I/O) performance and access latencies that are orders of magnitude better than those of rotating hard-disk drives (also known as HDDs). SSDs provide several other advantages over HDDs: they significantly reduce power consumption and dramatically improve robustness and shock resistance thanks to the absence of moving parts.
On the other hand, SSDs have a much higher cost per gigabyte (GB) than HDDs, although the price of SSDs is steadily decreasing. It is anticipated that the price gap between SSDs and HDDs will remain in the coming years.
It is widely accepted that the most cost-efficient way to integrate flash into an enterprise storage system is to put hot data on flash and cold data on disks. The term hot data refers to information that is used most frequently, and the term cold data refers to information that is used less frequently. In other words, the hot data is placed on the SSDs, which are the fastest hardware, and the cold data on the HDDs, which are slower. One way to achieve this is to use flash as a cache extension.
A cache is a fast piece of storage, used for copies of data that normally reside in a larger, slower piece of storage. The cache is used to speed up access to data resident in the slower storage.
In a modern enterprise storage system, there is one cache controller (or sometimes two, for fail-over purposes) that manages the read and write caches for the whole system. The cache controller may take the form of a powerful CPU together with Dynamic Random Access Memory (also known as DRAM) or battery-backed DRAM as the data store. The cache controller further implements typical cache replacement algorithms such as LRU (least recently used), ARC (adaptive replacement cache), or others.
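As an illustration of the kind of replacement algorithm mentioned above, the following is a minimal sketch of an LRU policy. The class name `LRUCache`, its methods, and the use of host LBAs as keys are assumptions made for this example only; they do not describe any particular cache controller.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical least-recently-used cache keyed by host LBA."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry comes first

    def read(self, lba):
        if lba not in self.entries:
            return None  # cache miss: the caller fetches from slower storage
        self.entries.move_to_end(lba)  # mark as most recently used
        return self.entries[lba]

    def write(self, lba, data):
        self.entries[lba] = data
        self.entries.move_to_end(lba)  # mark as most recently used
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.write(10, b"a")
cache.write(20, b"b")
cache.read(10)         # LBA 10 becomes the most recently used entry
cache.write(30, b"c")  # capacity exceeded: LBA 20 is evicted
```

In this sketch the entry for LBA 20 is evicted because it was touched least recently, while LBA 10 survives thanks to the intervening read.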
Most SSDs use flash memories, notably because flash memories ensure data persistence. However, flash memory does not support high-performance update in place, so it normally relies on relocate-on-write to boost write performance: updated data is written to a new location and the old copy is invalidated. In order to support relocate-on-write, flash management functions such as garbage collection and wear leveling have to be implemented by the system.
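Relocate-on-write can be pictured with a toy model such as the one below. It is only a sketch under simplifying assumptions: the class `FlashSketch`, its fields, and the page-granular layout are hypothetical, and erase blocks, wear leveling, and the actual reclamation step of garbage collection are left out.

```python
class FlashSketch:
    """Toy model of relocate-on-write; not a real flash management layer."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages   # physical pages, None = erased
        self.valid = [False] * num_pages  # True if the page holds live data
        self.l2p = {}                     # logical address -> physical page
        self.next_free = 0                # next erased page to program

    def write(self, lba, data):
        if self.next_free >= len(self.pages):
            raise RuntimeError("no erased page left; garbage collection needed")
        old = self.l2p.get(lba)
        if old is not None:
            self.valid[old] = False       # the old copy becomes garbage
        self.pages[self.next_free] = data  # the update goes to a new page
        self.valid[self.next_free] = True
        self.l2p[lba] = self.next_free
        self.next_free += 1

    def read(self, lba):
        return self.pages[self.l2p[lba]]

    def garbage_pages(self):
        """Programmed pages that no longer hold live data."""
        return [p for p in range(self.next_free) if not self.valid[p]]


flash = FlashSketch(num_pages=4)
flash.write(7, b"v1")
flash.write(7, b"v2")  # relocated to page 1; page 0 is now invalid
```

After the second write, the logical address 7 points to physical page 1, and page 0 is a candidate for garbage collection.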
Several straightforward approaches exist to integrate flash as a cache into enterprise systems. The first consists of using the flash memory as a raw storage medium and implementing the flash management functions in an existing cache controller. In practice, the existing cache controller manages the read and write caches of the HDDs. Besides running the cache replacement algorithm, the cache controller has to manage the flash media (e.g., SSDs) by maintaining metadata for all data chunks cached on flash memory. The metadata is essentially a table showing which host data chunk, addressed by Logical Block Addressing (LBA), is stored on which flash page, addressed by a physical block address (PBA). The cache controller has to manage this address mapping table and perform relocate-on-write through garbage collection or de-staging, which may consume a significant portion of its computational resources. The major drawback of this architecture is that it does not scale well with the size of the flash space.
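The scaling concern can be made concrete with a rough estimate of the size of such a mapping table. The figures below are assumptions chosen for illustration: a 4 KB page (the example page size used in this description), a hypothetical 8 bytes per table entry, and arbitrary flash capacities.

```python
def mapping_table_bytes(flash_bytes, page_bytes=4096, entry_bytes=8):
    """Estimate the size of a one-entry-per-page LBA-to-PBA table.

    page_bytes and entry_bytes are illustrative assumptions.
    """
    num_pages = flash_bytes // page_bytes
    return num_pages * entry_bytes


MIB = 2**20
GIB = 2**30
TIB = 2**40

small = mapping_table_bytes(64 * GIB)  # 2**24 pages -> 128 MiB of metadata
large = mapping_table_bytes(4 * TIB)   # 2**30 pages -> 8 GiB of metadata
```

Under these assumptions the table grows linearly with the flash capacity, so the metadata that the cache controller must hold and update grows with every SSD added.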
Another straightforward approach is to use existing SSDs as cache extensions. In this architecture, flash memory is managed by a flash controller inside each SSD. The flash controller is transparent to the cache controller and dedicated to flash management functions such as garbage collection and wear leveling. The cache controller can view multiple flash SSDs as an integrated linear collection of pages for cache use (e.g., pages of 4 KB each). The pages are addressed by the cache controller using the LBAs of each SSD. For cache management purposes, the cache controller maintains a metadata table that maps each cached host LBA-addressed block to a flash LBA-addressed block. One advantage of this architecture is that the cache controller does not need to manage flash memory, thanks to the use of flash SSDs. However, this architecture does not scale well either, because the cache controller has to address the entire flash LBA space offered by the flash SSDs, which can be huge.
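One possible way to picture this linear view is sketched below. It assumes that the SSDs are simply concatenated, which is only one option (striping would be another); the function `locate_page`, the table `host_to_flash`, and all numeric values are hypothetical.

```python
def locate_page(page_index, pages_per_ssd):
    """Map a linear cache page index to (ssd_index, local_page_lba).

    Assumes the SSDs are concatenated in list order.
    """
    if page_index < 0:
        raise ValueError("page index must be non-negative")
    for ssd_index, count in enumerate(pages_per_ssd):
        if page_index < count:
            return ssd_index, page_index
        page_index -= count  # skip past this SSD's pages
    raise IndexError("page index beyond total flash capacity")


# Cache metadata: host LBA -> linear flash page index (illustrative values).
host_to_flash = {123456: 1500}

pages_per_ssd = [1000, 1000, 2000]  # three SSDs of different sizes
location = locate_page(host_to_flash[123456], pages_per_ssd)
```

Here the cached host block lands on the second SSD at local page 500. The metadata table still needs one entry per cached page across all SSDs, which is the scaling limit noted above.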