1. Technical Field
This invention generally relates to computer systems, and more particularly to techniques for data storage.
2. Description of Related Art
Computer systems may include different resources that may be used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as the Symmetrix™ family of data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more host processors and provide storage services to each host processor. An example data storage system may include one or more data storage devices, such as those of the Symmetrix™ family, that are connected together and may be used to provide common data storage for one or more host processors in a computer system. An example of operation and management of a data storage system is the Symmetrix data storage system as described in U.S. Pat. No. 5,819,310, Vishlitzky et al., entitled “Method and Apparatus for Reading Data from Mirrored Logical Volumes on Physical Drives”, issued Oct. 6, 1998, which is herein incorporated by reference, U.S. Pat. No. 5,592,432, entitled “Cache Management System Using Time Stamping for Replacement Queue”, issued Jan. 7, 1997, Vishlitzky et al., which is herein incorporated by reference, and U.S. Pat. No. 5,381,539, issued on Jan. 10, 1995, entitled “System and Method for Dynamically Controlling Cache Management”, Yanai et al., which is herein incorporated by reference, all of which are assigned to EMC Corporation of Hopkinton, Mass.
There may be a need within a computer system to collect statistics and other data. For example, in connection with performing data operations within a data storage system, an optimizer may be used to perform certain optimizations, such as those in connection with increasing disk performance of devices within the data storage system. Such optimizations may include, for example, performing logical device swapping. Statistics, such as measurements related to device performance, may be collected and utilized by the optimizer. For example, statistics may be gathered regarding data storage operations, such as read and write operations, for devices in a data storage system.
The amount of statistical data collected over a predetermined time period may be large, for example, in systems that have a large number of physical and/or logical devices. The amount of data may increase as the number of devices increases.
Because there may be a large amount of data, efficient techniques may be used in connection with storage of the statistics, such as those used by the optimizer. Additionally, the optimizer may be invoked frequently to perform optimization determinations using these statistics. Thus, efficient retrieval techniques may also be used in connection with accessing the statistics.
One technique may store all the statistics in virtual memory. This may provide the advantage of ease of implementation. Additionally, the memory management may be performed by the underlying operating system, for example, in connection with memory page swapping operations used with virtual memory management systems. However, with this technique the statistics may exceed the amount of available virtual memory as the amount of statistics collected increases. Additionally, as the amount of statistics increases, relying on the virtual memory management techniques of a system may result in performance decreases due to the large amount of page swapping that may occur as data is accessed for use by the optimizer, particularly since the physical ordering of data may not match the logical ordering of requested data.
Thus, it may be desirable to have an efficient technique for storing large amounts of data, such as statistics, that also provides efficient retrieval, since the data may have to be accessed frequently, and that overcomes the disadvantages of prior techniques.
In accordance with principles of the invention is a method executed in a computer system for storing data. At least one data value is collected. A delta value is determined for each data value. The delta value represents a difference between a current data value and a previous data value. Each of the delta values is compressed producing a corresponding compressed delta value using a compression function that is monotonically increasing such that if a first delta value A is less than a second delta value B, a compressed value of A is also less than a compressed value of B. Each of the compressed delta values is stored.
In accordance with another aspect of the invention is a computer program product for storing data. The computer program product includes machine executable code for: collecting at least one data value; determining a delta value for each data value, said delta value representing a difference between a current data value and a previous data value; compressing each of said delta values producing a corresponding compressed delta value using a compression function that is monotonically increasing such that if a first delta value A is less than a second delta value B, a compressed value of A is also less than a compressed value of B; and storing each of said compressed delta values.
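The collect, delta, compress, and store steps recited above may be illustrated by the following minimal sketch. It is not the compression function of the invention, which is not specified in this section. As an assumption made purely for illustration, it uses a length-prefixed big-endian byte encoding, under which small deltas occupy fewer bytes and compressed values compared as byte strings keep the same order as the deltas themselves. The names deltas, compress_delta, and decompress_delta, the sample counter values, and the restriction to non-negative deltas are likewise hypothetical.

```python
def deltas(samples):
    """Difference between each sample and the previous one."""
    return [cur - prev for prev, cur in zip(samples, samples[1:])]


def compress_delta(delta):
    """Encode a non-negative delta as a length byte plus minimal big-endian bytes.

    Assumed stand-in for the compression function: byte strings produced here
    compare (lexicographically) in the same order as the integers they encode,
    so A < B implies compress_delta(A) < compress_delta(B).
    """
    if delta < 0:
        raise ValueError("this sketch assumes non-negative deltas")
    body = delta.to_bytes((delta.bit_length() + 7) // 8, "big")
    return bytes([len(body)]) + body


def decompress_delta(blob):
    """Recover the delta from its encoded form."""
    return int.from_bytes(blob[1:1 + blob[0]], "big")


# Hypothetical cumulative I/O counter sampled at successive intervals.
samples = [1000, 1012, 1012, 1300, 70000]
stored = [compress_delta(d) for d in deltas(samples)]
```

Because the encoding preserves order, stored values may be compared or sorted without first being decompressed, which is one reason a monotonically increasing compression function may be convenient when the statistics are accessed frequently.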