1. Field of the Invention
The present invention relates to data storage and management methods and systems. More particularly, the present invention relates to methods and systems for hierarchical storage management, data management and arrangement into storage resources based upon a specific set of pre-selected parameters.
2. Related Art
As businesses expand in data volume and diversity, organizations must manage an escalating physical quantity of stored data, the demand for universal access to information, the growing complexity of storage environments, and the adoption of emerging technologies. Very few companies have unlimited resources or time to address these challenges. Today, companies have to consider new storage management strategies based on high performance, intelligent systems, and sophisticated software that enable the management of existing data and existing networks while maximizing uptime and reducing the cost of data storage.
Hierarchical storage management is a method for managing large amounts of data. Files/data are assigned to various storage media based on how soon or how frequently they will be needed. The main characteristics of data are evaluated by the storage system. Data is managed based on one or a plurality of those characteristics, such as time interval, frequency of use and/or value. The user's interest is also evaluated based on the same main characteristics. Data is managed according to users' interest during the data's lifecycle. Data can also be arranged into appropriate storage resources depending on storage costs.
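By way of illustration only, the assignment of files to storage media according to how soon or how frequently they will be needed can be sketched as follows. The tier names, the `FileStats` fields, and the threshold values are hypothetical and are not taken from any particular HSM product.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file usage record."""
    name: str
    days_since_last_access: int
    accesses_per_month: int


def choose_tier(stats, hot_accesses=30, warm_days=90):
    """Pick a storage tier from access frequency and recency.

    The thresholds are assumed values chosen for the example.
    """
    if stats.accesses_per_month >= hot_accesses:
        return "online_disk"    # frequently used data stays on fast media
    if stats.days_since_last_access <= warm_days:
        return "nearline_disk"  # recently used, but not frequently
    return "tape_archive"       # neither frequent nor recent


if __name__ == "__main__":
    for f in (FileStats("orders.db", 0, 400),
              FileStats("q3_report.pdf", 20, 2),
              FileStats("backup_old.tar", 700, 0)):
        print(f.name, choose_tier(f))
```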
The management of data during its lifecycle is a challenging task. The main challenge lies in how to manage very large, constantly increasing volumes of data while at the same time controlling the cost associated with data management and preserving a very low Total Cost of Ownership (TCO).
The basic requirements for the successful management of storage systems, as identified within the presently available technologies for managing and storing large volumes of data within the desired budget, are to possess fully scalable architectures and to provide data management services at minimal cost. A fully scalable architecture limits neither the capacity of the storage systems nor the management range of the data management software of a storage area network integrated within the hardware architecture. Minimal TCO can be achieved by keeping administration tasks to a minimum.
Object Based Storage Devices (OSD) and Reliable Array of Independent Nodes (RAIN) are examples of storage system architectures that aim at fully scalable data management.
Minimal TCO has traditionally been achieved by managing data storage via Hierarchical Storage Management (HSM) systems. HSM systems allow the management of data files among a variety of data storage media. One challenge HSM systems face is that the media involved differ in access time, capacity, and cost, such that they are difficult to manage in an integrated manner. For example, short-term storage media, such as magnetic disks that can be arranged as a redundant array of independent disks (RAID), have different parameters from any other components within the network, such that they need to be managed separately. HSM systems provide an interim solution by providing automatic performance tuning for storage, thereby eliminating performance bottlenecks. Currently, the technology behind HSM systems involves preserving the access frequency for each data volume and analyzing its access pattern. It also involves normalizing the access ratio to the storage subsystem by migrating logical volumes within the storage. One example of a current HSM system is CruiseControl®, included in the Hitachi Lightning 9900™ V product series, which is widely available today.
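A minimal sketch of the general idea of recording per-volume access frequency and evening out load by migrating a logical volume is given below. It is an illustrative assumption only and does not represent the algorithm of CruiseControl® or of any other product; the function name, the grouping of volumes, and the selection rule are all hypothetical.

```python
def rebalance_once(placement, access_counts):
    """Migrate one logical volume from the busiest group to the idlest.

    placement:     dict mapping volume name -> storage group name
    access_counts: dict mapping volume name -> recorded access count
    Returns (volume, source, destination), or None if no single move
    would narrow the load gap. The selection rule is an assumption.
    """
    load = {group: 0 for group in sorted(set(placement.values()))}
    if not load:
        return None
    for volume, group in placement.items():
        load[group] += access_counts.get(volume, 0)

    busiest = max(load, key=load.get)
    idlest = min(load, key=load.get)
    gap = load[busiest] - load[idlest]

    # Only a volume with fewer accesses than the gap narrows the gap.
    candidates = [v for v, g in placement.items()
                  if g == busiest and 0 < access_counts.get(v, 0) < gap]
    if not candidates:
        return None

    # Moving a volume whose load is closest to half the gap evens things most.
    best = min(candidates, key=lambda v: abs(access_counts[v] - gap / 2))
    placement[best] = idlest
    return best, busiest, idlest


if __name__ == "__main__":
    placement = {"v1": "A", "v2": "A", "v3": "B"}
    counts = {"v1": 100, "v2": 40, "v3": 10}
    print(rebalance_once(placement, counts))
```

In this example group A initially carries 140 accesses and group B carries 10, so the sketch moves the volume with 40 accesses, leaving the groups at 100 and 50.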
OSD and RAIN architectures are examples of fully scalable architectures that need additional technologies besides hierarchical storage data management to achieve and maintain minimal TCO in managing data. If a company regularly adds identical storage systems (for example, online storage devices) to expand its storage capabilities as the data volume grows, very high costs are incurred due to the regular addition of storage capacity. As storage capacity rapidly reaches its limits, the company cannot minimize its TCO.
Another aspect to consider is that data has its own value, which varies through its lifecycle. There is a need for architectures that contain different types of storage devices and media and that manage data depending on its value and lifecycle. Data should be stored in the appropriate place, depending on its value. It is important to provide a system which automatically defines where data should be stored by considering both data value and storage cost.
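As a purely illustrative sketch of weighing data value against storage cost, the following assumes that value and cost are expressed in the same unit per gigabyte and that a more expensive tier offers better service. Both assumptions, as well as the tier names and figures, are hypothetical.

```python
def place_by_value(value_per_gb, tier_costs):
    """Choose a tier whose cost is justified by the data's value.

    tier_costs: dict mapping tier name -> cost per GB (assumed figures).
    Returns the most expensive tier whose cost does not exceed the
    data's value, or the cheapest tier if none qualifies.
    """
    justified = {t: c for t, c in tier_costs.items() if c <= value_per_gb}
    if not justified:
        return min(tier_costs, key=tier_costs.get)
    return max(justified, key=justified.get)


if __name__ == "__main__":
    tiers = {"raid": 5.0, "sata": 2.0, "tape": 0.5}  # hypothetical costs
    for value in (10.0, 3.0, 0.1):
        print(value, place_by_value(value, tiers))
```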
Traditional HSM technologies do not take into consideration changes in the data's value through its lifecycle. Currently, users define data lifecycle management procedures statically, before archiving, and data is stored in different types of storage media based on predefined parameters. For example, when the predefined lifetime of certain data stored in a RAID system expires, the system simply archives the data to tape. However, the value of data through its lifecycle also depends on the users' interest, which varies from time to time. If users want to change the value of data during its lifecycle, they have to do so manually, which incurs additional management costs.
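The static behavior described above can be sketched as a rule that looks only at a predefined lifetime. The function and parameter names are hypothetical; the point of the example is that no input reflects the users' current interest in the data.

```python
from datetime import date, timedelta


def static_lifecycle_tier(stored_on, today, lifetime_days):
    """Return "tape" once the predefined lifetime has expired, else "raid".

    The decision depends only on age; how often or how recently users
    access the data plays no part in it.
    """
    expired = today - stored_on > timedelta(days=lifetime_days)
    return "tape" if expired else "raid"


if __name__ == "__main__":
    print(static_lifecycle_tier(date(2020, 1, 1), date(2020, 1, 15), 30))
    print(static_lifecycle_tier(date(2020, 1, 1), date(2020, 3, 1), 30))
```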
There is a need for methods and systems for hierarchical storage management that take into consideration the data's value based on users' interest through the data's lifecycle, and then arrange the data into appropriate storage resources based upon the data's value and storage costs.
There is also a need for methods and systems for hierarchical storage management that allow fully scalable architectures, such as OSD and RAIN, to manage data through its lifecycle with minimal TCO.