The present invention relates to management of a multi-tier storage environment, and more specifically, this invention relates to efficient management of high performance tiers in a multi-tier storage environment.
A file system defines how files are named and manages how they are placed for storage and retrieval. File system functionality may be divided into two components: a user component and a storage component. The user component is responsible for managing files within directories, file path traversals, and user access to files. The storage component of the file system determines how files are stored physically on the storage device.
File blocks are mapped to logical blocks, which are then mapped onto actual physical blocks on storage media. A logical to physical mapping layer is used to make file management independent of storage management. A file system 102 is shown in FIG. 1, where File 1 has two file blocks 112: FBlock 0 (FB0) and FBlock 1 (FB1). FB0 and FB1 for File 1 are mapped to two logical blocks 110: LBlock 0 and LBlock 1. LBlock 0 and LBlock 1 are mapped to actual physical blocks 108 (Block 0 and Block 10) on the storage medium 104. For File 2, file blocks 112 FB0, FB1, and FB2 are mapped to LBlock 2, LBlock 3, and LBlock 4, which are mapped to actual physical blocks 108 (Block 30, Block 50, and Block 60) on the storage medium 104. Since accesses to the storage medium 104 (such as a hard disk drive (HDD), magnetic tape, etc.) are slower, data blocks are stored to the in-memory cache 106 for quicker access. On a first read operation, data is copied from the storage medium 104 to the in-memory cache 106, in an action referred to as a “Cache Miss.” Subsequent accesses on the block are performed from the in-memory cache 106 once the desired data is stored therein. Blocks from the in-memory cache 106 are written to the storage medium 104 in either of two scenarios: 1) in-memory cache 106 space is limited, so when new blocks are to be stored to the in-memory cache 106, old blocks are evicted from the in-memory cache 106 and stored on the storage medium 104 in an action referred to as a “Cache Eviction;” and 2) when an application explicitly commands the in-memory cache 106 to flush data to the storage medium 104.
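By way of illustration only, the two-level mapping and the cache behavior described above may be sketched as follows. The block numbers are those given for FIG. 1; the names `FILE_TO_LOGICAL`, `LOGICAL_TO_PHYSICAL`, `BlockCache`, and `resolve` are hypothetical, and the least-recently-used eviction order is an assumption, since the description above does not specify which old blocks are evicted.

```python
from collections import OrderedDict

# Mappings from the FIG. 1 description; the dictionary layout is illustrative.
FILE_TO_LOGICAL = {
    ("File 1", 0): 0, ("File 1", 1): 1,
    ("File 2", 0): 2, ("File 2", 1): 3, ("File 2", 2): 4,
}
LOGICAL_TO_PHYSICAL = {0: 0, 1: 10, 2: 30, 3: 50, 4: 60}


def resolve(file_name, file_block):
    """Translate (file, file block) to a physical block via the logical layer."""
    logical = FILE_TO_LOGICAL[(file_name, file_block)]
    return LOGICAL_TO_PHYSICAL[logical]


class BlockCache:
    """In-memory cache in front of a slower medium (LRU order is an assumption)."""

    def __init__(self, medium, capacity):
        self.medium = medium        # physical block number -> data
        self.capacity = capacity    # maximum number of cached blocks
        self.cache = OrderedDict()  # least recently used entry first
        self.misses = 0
        self.evictions = 0

    def _insert(self, block, data):
        if block not in self.cache and len(self.cache) >= self.capacity:
            # Cache eviction: the oldest block is stored back on the medium.
            old_block, old_data = self.cache.popitem(last=False)
            self.medium[old_block] = old_data
            self.evictions += 1
        self.cache[block] = data
        self.cache.move_to_end(block)

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)
            return self.cache[block]
        # Cache miss: copy the block from the medium into the cache.
        self.misses += 1
        data = self.medium[block]
        self._insert(block, data)
        return data

    def write(self, block, data):
        # The write stays in the cache until an eviction or an explicit flush.
        self._insert(block, data)

    def flush(self):
        # Explicit flush: write every cached block to the medium.
        self.medium.update(self.cache)
```

In this sketch, `resolve("File 1", 1)` yields physical Block 10, and a first `read` of that block counts as a miss while a second `read` is served from the cache.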
Multi-tiered storage is a storage method where data is stored on various types of storage devices based primarily on access, frequency of use, security, and/or data recovery requirements. For example, data that is frequently accessed by an application that is response time sensitive might be stored on a solid state drive (SSD). Other data that is infrequently accessed and for which a higher response time is more tolerable might be stored on high capacity 7200 RPM HDDs. The cost per gigabyte of storage is much higher for SSDs than it is for the 7200 RPM HDDs. One challenge in effectively using multi-tiered storage is identifying the data that benefits from the higher cost/higher performance storage tiers. Over time, the optimal tier for a given piece of data may change; thus, the identification and movement of data to an appropriate tier is an ongoing and evolving process.
Since SSDs are costlier than HDDs, preferred solutions allow for dynamic relocation of data across tiers based on the data usage by placing “hot” data with high I/O density and low response time requirements on SSDs while targeting HDDs or other slower-responding data storage devices for “cooler” data that is accessed more sequentially and/or at lower rates.
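By way of illustration only, one possible placement policy of this kind may be sketched as follows. The sketch assumes that I/O density is measured as accesses per gigabyte over some monitoring window and that the hottest data is placed greedily until the SSD tier is full; the names `Extent` and `plan_placement` are hypothetical, and no particular relocation algorithm is implied by the description above.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    name: str
    size_gb: float
    io_count: int  # accesses observed in the monitoring window (assumed metric)

    @property
    def io_density(self):
        # Accesses per gigabyte; higher values indicate "hotter" data.
        return self.io_count / self.size_gb


def plan_placement(extents, ssd_capacity_gb):
    """Greedy sketch: hottest extents go to SSD while space remains, rest to HDD."""
    placement = {}
    free_gb = ssd_capacity_gb
    for ext in sorted(extents, key=lambda e: e.io_density, reverse=True):
        if ext.size_gb <= free_gb:
            placement[ext.name] = "SSD"
            free_gb -= ext.size_gb
        else:
            placement[ext.name] = "HDD"
    return placement
```

Because the optimal tier for a given piece of data may change over time, such a plan would be recomputed periodically from fresh access counts, with data moved between tiers wherever the new plan differs from the current placement.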