This invention relates generally to managing and analyzing data in a data storage environment, and more particularly to a system and method for determining workload characteristics including the profiles for such characteristics for one or more applications operating in a data storage environment.
Computer systems are constantly improving in terms of speed, reliability, and processing capability. As is known in the art, computer systems which process and store large amounts of data typically include one or more processors in communication with a shared data storage system in which the data is stored. The data storage system may include one or more storage devices, usually of a fairly robust nature and useful for storage spanning various temporal requirements, e.g., disk drives. The one or more processors perform their respective operations using the storage system. Mass storage systems, particularly those of the disk array type, have centralized data as a hub of operations, all while driving down costs. But performance demands placed on such mass storage have increased and continue to do so.
Design objectives for mass storage systems include cost, performance, and availability. These objectives typically translate to a low cost per megabyte, high I/O performance, and high data availability. Availability is measured by the ability to access data. Often such data availability is provided by use of redundancy, such as well-known mirroring techniques.
One problem encountered in the implementation of disk array data storage systems concerns optimizing the storage capacity while maintaining the desired availability and reliability of the data through redundancy. Because of cost and necessity, it is important to allocate as closely as possible the right amount of storage capacity without going significantly over or under, but this is a complex task. It has required a great deal of skill and knowledge about computers, software applications such as databases, and the very specialized field of data storage. Such requisite abilities have long been expensive and difficult to access. There remains, and probably will continue to be, an increasing demand for and corresponding scarcity of such skilled people.
Determining the size and number of disk arrays or other data storage systems needed by a customer requires information about space, traffic, and a desired quality of service. It is not sufficient to size a solution simply based on the perceived quantity of capacity desired, such as the number of terabytes believed to be adequate.
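The point that capacity alone is not a sufficient basis for sizing can be illustrated with a short arithmetic sketch. It is offered only as an illustration under stated assumptions and is not taken from any tool described herein: the function name `disks_needed`, the per-disk capacity and I/O figures, and the use of a utilization ceiling as a stand-in for a desired quality of service are all hypothetical.

```python
import math

def disks_needed(capacity_gb, peak_iops, disk_capacity_gb, disk_iops,
                 max_utilization=0.7):
    """Return (required, by_capacity, by_traffic) disk counts.

    Hypothetical sizing rule: the configuration must satisfy both the
    space requirement and the traffic requirement, so the larger of the
    two counts governs. max_utilization is an assumed proxy for quality
    of service (disks are kept below this fraction of their rated IOPS).
    """
    by_capacity = math.ceil(capacity_gb / disk_capacity_gb)
    by_traffic = math.ceil(peak_iops / (disk_iops * max_utilization))
    return max(by_capacity, by_traffic), by_capacity, by_traffic

# Illustrative figures only: 10 TB of data, 8000 peak I/Os per second,
# disks assumed to hold 500 GB and sustain 150 I/Os per second each.
required, by_capacity, by_traffic = disks_needed(10_000, 8_000, 500, 150)
print(required, by_capacity, by_traffic)
```

With these assumed figures, twenty disks would hold the data, but the traffic requirement calls for seventy-seven, so a solution sized on terabytes alone would be substantially under-provisioned.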
There is a long-felt need for a computer-based tool that would provide a straightforward, non-complex way to allocate proper storage capacity while balancing cost, growth plans, workload, and performance requirements. This would be an advancement in the computer arts with particular relevance in the field of data storage.
Another problem that exists is the need for an automated tool that is capable of building a highly granulated graph or profile of workload data collected from work on a storage system, such as IO or response time data. Although workload data may be collected by prior art systems such as the ECC Workload Analyzer available from EMC Corporation of Hopkinton, the ability to particularly identify information related to variables of interest is not available in automated systems in the art. It would be an advantage if such highly resolved profile information could be either used separately or combined with the computer-based tool for allocating capacity as described above.
For example, given a data storage environment wherein several hundred storage devices, e.g., hard disk drives, operate in conjunction with a storage array such as the EMC Symmetrix or EMC Clariion, the IO workload generated is highly complex and difficult to analyze. It would be advantageous if the workload could be used to sort data volumes or logical devices according to which devices contain data being used for the work. If such a sorting action could be used further to sort such devices into groups or families of devices having similar work characteristics or being used by similar or identical software applications, this would be a further advantage. But since the data being used to create a workload is distributed across many disks, it is complex to sort out such information, and so no tool in the prior art is capable of making such a determination. Nevertheless, it would clearly be an advancement in the computer arts and a satisfaction of a long-felt need if such a tool were available.
Further, it would be useful and advantageous if such a tool could identify how many business applications are active as well as which devices have data used by such applications. Further, if the tool could do these things on a relatively automated basis, such that a high degree of computer expertise was not needed to use the tool, this would also be a significant advancement in the computer arts.
To overcome the problems described above and to provide the advantages also described above, the present invention is a system and method for using work-related data for a storage management function. In one embodiment the method uses a dataset on which work is performed, wherein the dataset represents data stored on one or more logical devices that are part of a data storage environment. This method embodiment includes the steps of analyzing work performed on the dataset to determine a correlation between at least two logical devices, and using the correlation to perform a storage management function.
In an embodiment of the system, a computer with a display and memory is configured with computer-executable program logic capable of performing the steps of analyzing work-related data for creating a correlation of logical devices and then using the correlation to perform a storage management function.
In another embodiment, a program product includes a computer-readable medium having code included on the medium configured to carry out computer-executed steps of analyzing work-related data for creating a correlation of logical devices and then using the correlation to perform a storage management function.