A storage system environment may include one or more storage systems (for storing server data on storage devices) and multiple server systems accessing each storage system. Each storage system includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the storage devices. Each file may comprise a set of data blocks, whereas each directory may be implemented as a specially-formatted file in which information about other files and directories is stored.
The storage operating system generally refers to the computer-executable code operable on a storage system that manages data access and access requests (read or write requests requiring input/output operations) and may implement file system semantics in implementations involving storage systems. In this sense, the Data ONTAP® storage operating system, available from NetApp, Inc. of Sunnyvale, Calif., which implements a Write Anywhere File Layout (WAFL®) file system, is an example of such a storage operating system implemented as a microkernel within an overall protocol stack and associated storage. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
A storage system is typically implemented as one or more storage volumes that comprise physical storage devices, defining an overall logical arrangement of storage space. Available storage system implementations can serve a large number of discrete volumes. A storage volume is “loaded” in the storage system by copying the logical organization of the volume's files, data, and directories into the storage system's memory. Once a volume has been loaded in memory, the volume may be “mounted” by one or more users, applications, devices, and the like that are permitted to access its contents and navigate its namespace.
A storage system may be configured to allow server systems to access its contents, for example, to read or write data to the storage system. A server system may execute an application that “connects” to the storage system over a computer network, such as a shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. The application executing on the server system may send an access request (read or write request) to the storage system for accessing particular data stored on the storage system.
To provide support capabilities, a server or storage system may periodically produce and store logs/files of system events (“system logs/files”). A server or storage system may also collect and store (to the system logs) particular system information that is important for support purposes, such as maintenance or debugging/correcting errors of the server or storage system. For example, a system log may contain a list of important system activities/events, system configuration information, and system performance data. The system logs may then be analyzed/processed by administrators or program applications (e.g., debuggers) for providing support to the server or storage system (e.g., for maintenance or debugging).
In a storage system environment, the system logs generated by multiple server and storage systems may also be transferred and stored to a central storage location (“central repository”). For example, each server system may transmit its system logs to a particular storage system that acts as the central repository. The system logs of the server and storage systems may then be stored and further analyzed/processed at the central location for providing support to the server and storage systems and for detecting support issues across multiple server and storage systems.
Typically, however, storing the system logs may consume significant storage resources. The use of storage resources may also be compounded by the increasing number of server or storage systems used in the storage system environment, the increasing quantities of system events, configuration data, and performance data that must be stored to a system log to provide effective support, and the long periods of time for which the system logs may be kept.
Data compression techniques may be implemented to reduce the storage space required by the system logs. Although data compression may reduce the storage size of a file, compression may also increase the access time to the file (since the file must first be decompressed). Compression may be appropriate for system logs, however, since each individual system log is typically rarely accessed and the increase in access time may be justified by the storage space savings. Conventionally, the system logs are compressed individually. However, such compression may not provide sufficient storage savings, since compression techniques typically produce greater compression ratios when a file has many repeating elements, and an individual system log/file may not contain enough repeating elements to produce a high compression ratio. The ratio or degree of compression is typically determined by comparing the original data size with the compressed data size, with a higher ratio or degree of compression indicating a smaller compressed data size for the same original data size.
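As a minimal illustrative sketch of how a compression ratio may be computed, the following Python example divides the original size by the compressed size. The use of the zlib library, the function name compression_ratio, and the sample log contents are assumptions made for illustration only and are not taken from any particular storage system.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size.

    A higher value means a smaller compressed size for the same input size.
    """
    compressed = zlib.compress(data)
    return len(data) / len(compressed)


# Hypothetical log contents, invented for illustration.
# One log repeats the same entry many times; the other holds a single entry.
repetitive_log = b"INFO disk check completed status=ok\n" * 200
short_log = b"WARN fan speed below threshold\n"

print("repetitive log ratio:", round(compression_ratio(repetitive_log), 2))
print("short log ratio:", round(compression_ratio(short_log), 2))
```

In this sketch, the log with many repeating elements yields a much higher ratio than the short log, whose compressed form may even exceed its original size because of the fixed overhead of the compressed format.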
System logs may also be grouped and then compressed as a group. However, system logs are typically grouped together arbitrarily (e.g., grouping random system logs, grouping system logs in order received, etc.). Since the system logs are grouped arbitrarily, the degree of compression may still not provide sufficient storage savings since the arbitrary grouping of system logs may still not contain enough repeating elements to produce a high compression ratio. As such, there is a need for a method for compressing system logs in a more efficient manner.
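The effect of grouping on compressed size can be illustrated with a short Python sketch. It compares the total size of logs compressed individually against the size of the same logs concatenated and compressed as one group. The zlib library, the helper names, and the log contents are all hypothetical; seeded pseudo-random bytes stand in for a group of logs that share no repeating elements.

```python
import random
import zlib


def individual_size(logs):
    """Total compressed size when each log is compressed separately."""
    return sum(len(zlib.compress(log)) for log in logs)


def grouped_size(logs):
    """Compressed size when the logs are concatenated and compressed once."""
    return len(zlib.compress(b"".join(logs)))


def grouping_savings(logs):
    """Bytes saved by compressing as a group rather than individually."""
    return individual_size(logs) - grouped_size(logs)


# Hypothetical logs that share a common configuration block.
common = b"cpu=8 mem=32G disks=24 raid=6 nic=eth0 mtu=1500 proto=nfs\n"
related_logs = [common + f"host=h{i} event=boot\n".encode() for i in range(4)]

# Stand-in for a group with no shared content: seeded pseudo-random bytes.
rng = random.Random(0)
unrelated_logs = [bytes(rng.getrandbits(8) for _ in range(200)) for _ in range(4)]

print("savings, related logs:", grouping_savings(related_logs))
print("savings, unrelated logs:", grouping_savings(unrelated_logs))
```

Under these assumptions, grouping saves far more space when the grouped logs repeat one another's content than when they do not, which is consistent with the observation that an arbitrary grouping may not yield a high compression ratio.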