The present invention relates generally to the field of snapshot management for computer data storage software, and more particularly to snapshot management for software managing and controlling a large capacity disk type data storage device such as a shingled magnetic recording/high storage density storage system.
A file system defines rules for naming files and placing them on a storage device for storage and retrieval. File system functionality can be divided into two components: a user component and a storage component. The user component is responsible for managing files within directories, file path traversals and user access to the files. The storage component of the file system determines the physical locations where files are stored on the storage device.
In conventional storage systems, a file system snapshot is a record of the state of a storage device or file system at any given moment in time. The snapshot is a guide for restoring a file, a storage device or a file system in the event, for example, that the storage device fails. Typically, a snapshot is made essentially instantly, and is made available for use by other applications for purposes such as: (i) data protection; (ii) data analysis and reporting; and/or (iii) data replication.
In some snapshot implementations (including copy-on-write snapshot implementations, which will be further discussed below), a snapshot is a record of the state of a file system, including the date and time the snapshot was taken. After a snapshot is taken, if a file, or a portion of the file (herein called a data block, or a block), is to be updated, a new instance of the data block is created and stored at a physical address different from that of the original data block. The new instance becomes the active block and the previous instance becomes an inactive block. The file system, while keeping the inactive data block intact, updates its pointers to reference the active data block. To a typical user, nothing appears to have changed. The inactive data block (sometimes called the snapshot data), which is no longer accessible to some software, remains on the storage device and can be re-activated by an operation to restore the file (or even the whole file system) to the state in which it existed at the time the snapshot was taken. The active data block is stored on the storage device at a physical address that may have a shorter or longer access time relative to the inactive data block, and, consequently, a user may experience a change in the response time of the file system when an access operation involves the updated file.
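The pointer update described above can be illustrated with a minimal sketch. It is not taken from any particular file system: the class name `SnapshotVolume`, its methods, and the integer physical addresses are all hypothetical. For simplicity, every write allocates a fresh physical address and no space is ever reclaimed.

```python
class SnapshotVolume:
    """Toy block store (hypothetical): an update never overwrites the old block."""

    def __init__(self):
        self.physical = {}    # physical address -> stored data
        self.active = {}      # logical block number -> address of the active block
        self.snapshots = []   # list of (timestamp, frozen copy of the pointer map)
        self._next_addr = 0

    def _allocate(self, data):
        # Store the data at a new physical address and return that address.
        addr = self._next_addr
        self._next_addr += 1
        self.physical[addr] = data
        return addr

    def write(self, block_no, data):
        # The new instance becomes the active block; the previous instance
        # stays intact at its old address as an inactive block.
        self.active[block_no] = self._allocate(data)

    def read(self, block_no):
        return self.physical[self.active[block_no]]

    def take_snapshot(self, timestamp):
        # A snapshot records the pointer map and the time it was taken.
        self.snapshots.append((timestamp, dict(self.active)))
        return len(self.snapshots) - 1

    def restore(self, snapshot_id):
        # Re-activate the blocks that were active when the snapshot was taken.
        _, frozen = self.snapshots[snapshot_id]
        self.active = dict(frozen)


vol = SnapshotVolume()
vol.write(0, b"original")
snap = vol.take_snapshot("2020-01-01T00:00:00")
vol.write(0, b"updated")      # block 0 now points at a different address
latest = vol.read(0)
vol.restore(snap)
restored = vol.read(0)
```

In this sketch the restore operation only swaps pointers, which is why the inactive block must remain untouched on the device after the update.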
The active data block (sometimes herein variously referred to as the “latest data”, the “live data” or the “primary data”) continues to be available to applications without interruption, while the inactive data block: (i) is used to perform other functions on the data; (ii) enables improved application availability; (iii) enables faster recovery from failures or service interruptions; (iv) enables easier backup management of large volumes of data; (v) reduces exposure to data loss; (vi) virtually eliminates backup windows; and/or (vii) lowers total cost of ownership of a backup solution.
In a conventional storage system with “multi-tier” architecture, different categories of data are respectively stored on different “tiers” of the storage system, typically based on criteria such as: frequency of use; security requirements; data recovery requirements; and/or other access-related criteria. Examples of different storage tiers include: (i) SSD (solid state drive); (ii) SAS (serial attached SCSI); and (iii) nearline SAS. Different tiers represent different classes or qualities of service. The tier classification can vary based on factors such as speed and cost.
On some disk-type storage devices, there is a linear speed ratio between tracks located on an outer partition and tracks located on an inner partition. The ratio of outer-track speed to inner-track speed is typically close to 5/3. For example, a drive that is capable of a 120 MB/sec data transfer speed for data stored on the outer tracks might yield only a 72 MB/sec data transfer speed for data stored on the inner tracks.
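The arithmetic in the example above can be checked with a short sketch. The function names and the 3600 MB file size are assumptions chosen for illustration; only the 5/3 ratio and the 120 MB/sec and 72 MB/sec figures come from the text.

```python
from fractions import Fraction

# Outer-to-inner speed ratio quoted in the text.
OUTER_TO_INNER = Fraction(5, 3)


def inner_rate(outer_mb_per_s, ratio=OUTER_TO_INNER):
    """Inner-track transfer rate implied by an outer-track rate and the ratio."""
    return outer_mb_per_s / ratio


def transfer_seconds(size_mb, rate_mb_per_s):
    """Time to stream size_mb megabytes sequentially at a constant rate."""
    return size_mb / rate_mb_per_s


inner = inner_rate(120)                      # 120 / (5/3) = 72 MB/sec
outer_time = transfer_seconds(3600, 120)     # hypothetical 3600 MB file on outer tracks
inner_time = transfer_seconds(3600, inner)   # same file on inner tracks
```

Under these assumptions, the same sequential read takes 30 seconds from the outer tracks and 50 seconds from the inner tracks, so the time penalty equals the speed ratio.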
In some conventional systems, “copy-on-write” (COW) is the underlying mechanism for disk storage snapshots. In some COW data storage systems, multiple versions of an inactive data block (for example, one version for each successive snapshot that is followed by a write operation) are retained, and these versions accumulate from the time the data block first came into existence.
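A minimal sketch of how inactive versions can accumulate is given below. It assumes a simplified COW scheme in which the first write to a block after each snapshot preserves the block's prior contents; the class `VersionedBlocks` and its fields are hypothetical and do not describe any specific product.

```python
from collections import defaultdict


class VersionedBlocks:
    """Toy COW store (hypothetical): keeps one inactive version per snapshot/write pair."""

    def __init__(self):
        self.active = {}                    # block number -> current data
        self.inactive = defaultdict(list)   # block number -> [(snapshot id, old data)]
        self.snapshot_id = 0                # 0 means no snapshot has been taken yet
        self._preserved = set()             # blocks already copied since the last snapshot

    def take_snapshot(self):
        self.snapshot_id += 1
        self._preserved.clear()
        return self.snapshot_id

    def write(self, block_no, data):
        first_write_since_snapshot = (
            self.snapshot_id > 0
            and block_no in self.active
            and block_no not in self._preserved
        )
        if first_write_since_snapshot:
            # Preserve the old contents before they are replaced.
            self.inactive[block_no].append((self.snapshot_id, self.active[block_no]))
            self._preserved.add(block_no)
        self.active[block_no] = data


store = VersionedBlocks()
store.write(7, "a")
store.take_snapshot()
store.write(7, "b")    # preserves "a" for snapshot 1
store.write(7, "b2")   # no snapshot in between, so nothing new is preserved
store.take_snapshot()
store.write(7, "c")    # preserves "b2" for snapshot 2
history = store.inactive[7]
```

The history list grows by one entry for every snapshot that is followed by a write to the same block, which is the accumulation described above.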
In a large disk environment (for example, a shingled magnetic recording/high storage density or SMR storage system), tiered storage can be implemented wherein data with high access frequency are moved onto outer tracks (which can deliver higher read/write speed), and low frequency/archival data are moved onto inner tracks (which deliver comparatively lower read/write speed).
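The placement policy described above can be sketched as a simple ranking by access count. The function `place_by_frequency`, the `outer_capacity` parameter, and the sample counts are illustrative assumptions; a real system would also account for data sizes and migration cost.

```python
def place_by_frequency(access_counts, outer_capacity):
    """Assign the most frequently accessed items to outer tracks.

    access_counts:  mapping of item name -> number of recent accesses
    outer_capacity: how many items fit on the outer tracks (hypothetical unit)
    """
    # Rank by descending access count; break ties by name for a stable result.
    ranked = sorted(access_counts, key=lambda name: (-access_counts[name], name))
    return {
        name: ("outer" if position < outer_capacity else "inner")
        for position, name in enumerate(ranked)
    }


counts = {"log": 2, "db": 90, "archive": 1, "index": 40}
placement = place_by_frequency(counts, outer_capacity=2)
```

With these sample counts, the two hottest items ("db" and "index") land on the outer tracks and the remaining items go to the inner tracks.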