Computer systems are capable of storing vast amounts of data and information. To store such information, a typical computer system maintains a logical data structure called a file system that serves as an index of individually accessible data repositories commonly known as files. A typical file system may provide a hierarchical arrangement of directories, sometimes referred to as folders, and each directory or folder is able to maintain the identity of individual files stored within that directory or folder.
Each file in a directory of a file system may have a file name, as well as other metadata associated with that file, such as the size (e.g., in bytes) of the file, an identity of an owner or creator of the file, a set of access permissions associated with the file, and so forth. The actual data associated with the file is said to be stored “within” the file, but is actually data that is encoded at various locations within one or more storage devices, such as optical or magnetic disk drives, or within memory in the computer system. For each file listed in a file system, the file system maintains a pointer, sometimes called a “file handle,” that identifies the specific location within a data storage system at which the data for that file begins. When a program executing on the computer system needs to access that file (e.g., read data from or write data to it), the program can make a sequence of system calls to an operating system executing in the computer in order to, for example, open the file and read or write its data. As a program reads data from or writes data to the file, the file system, which is often part of the operating system, tracks the current location in each file where data was last read or written. Operating system developers and third party application developers have created conventional tools to record, within an access log, all access events (e.g., opens, reads, writes, closes, etc.) to a selected group of one or more files.
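The sequence of system calls described above can be illustrated with a minimal sketch. It assumes a Python environment, whose standard `os` module wraps the operating system's open, read, write, and seek calls; the function name `demo_file_offset` and the scratch file are hypothetical and serve only as illustration.

```python
import os
import tempfile


def demo_file_offset():
    """Write to and read from a scratch file, recording the current
    file offset that the operating system tracks after each access."""
    # mkstemp creates and opens a temporary file, returning an
    # OS-level file descriptor and the file's path.
    fd, path = tempfile.mkstemp()
    try:
        offsets = []
        os.write(fd, b"hello world")      # write 11 bytes starting at offset 0
        # Seeking 0 bytes from the current position reports the offset.
        offsets.append(os.lseek(fd, 0, os.SEEK_CUR))
        os.lseek(fd, 6, os.SEEK_SET)      # reposition to byte 6
        data = os.read(fd, 5)             # read bytes 6 through 10
        offsets.append(os.lseek(fd, 0, os.SEEK_CUR))
        return data, offsets
    finally:
        os.close(fd)                      # close the descriptor
        os.remove(path)                   # delete the scratch file
```

Calling `demo_file_offset()` returns `(b"world", [11, 11])`: the offset advances past the last byte written and, after the reposition, past the last byte read.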
Accordingly, computer system managers and software developers use such log file creation tools to produce access logs that record access events (e.g., a read) associated with stored data, such as accesses to one or more files stored on a disk. In a typical application, each access event recorded in a respective access log includes a start byte, a stop byte (or alternatively an offset byte value from the start byte), and a time stamp. Depending on the complexity of the log file creation tool, other information may be stored as well, such as the identity of the program performing the access. The range of bytes between the start byte and the stop byte indicates what portion of the file was accessed at the time indicated by the time stamp. Such recorded information in a respective access log can be analyzed to determine patterns of accessing data from, for example, a disk and/or a cache.
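As a rough illustration, an access event of this kind might be modeled as follows. The text above does not specify a log format, so the whitespace-separated layout "time stamp, start byte, stop byte", the inclusive treatment of the stop byte, and the names `AccessEvent`, `parse_line`, and `from_offset` are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One recorded access: when it happened and which bytes it touched."""
    timestamp: float   # time of the access, in seconds (assumed unit)
    start: int         # first byte accessed
    stop: int          # last byte accessed (assumed inclusive)

    @property
    def length(self) -> int:
        """Number of bytes covered by the access."""
        return self.stop - self.start + 1


def parse_line(line: str) -> AccessEvent:
    """Parse a hypothetical log line of the form '<time> <start> <stop>'."""
    ts, start, stop = line.split()
    return AccessEvent(float(ts), int(start), int(stop))


def from_offset(timestamp: float, start: int, offset: int) -> AccessEvent:
    """Build an event from the alternative form, in which the stop byte
    is given as an offset from the start byte."""
    return AccessEvent(timestamp, start, start + offset)
```

Under these assumptions, `parse_line("12.5 100 199")` and `from_offset(12.5, 100, 99)` describe the same 100-byte access.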
One conventional method of analyzing access logs for the purpose of recognizing patterns is to use a text editor to open an access log and analyze which regions of a file have been accessed by different processes running on a computer. That is, according to one conventional method, a user can open an access log via a text editor to study and interpret textual data in the access log. Based on reading the textual data, the user can identify specific details such as when and where accesses typically occur in a file. Additionally, the user can identify different sizes associated with accesses to a disk.
One purpose of analyzing the access logs as briefly mentioned above is to identify access patterns associated with a disk and, more particularly, one or more files stored on a disk. Based on identified access patterns, the data in a respective file can be stored more efficiently on a disk. For example, if data in the access log shows that two regions of a file are often accessed at nearly the same time and are always retrieved one after the other, a user may decide to reconfigure how the file is stored on disk so that the first range of data and the second range of data in the file are stored relatively close to each other on disk. Storing the first and second ranges of data closer to each other on a disk reduces the distance (and therefore the time) that a mechanical head reading from the disk must travel to move from one region to the other when reading the first region and the second region, one after the other.
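The kind of pattern described above could, in principle, be found programmatically rather than by reading the log in a text editor. The following is only a sketch under stated assumptions: events are given as hypothetical `(time stamp, start byte, stop byte)` tuples, and "nearly the same time" is approximated by an arbitrary threshold `max_gap` in seconds. Neither the function name `successive_pairs` nor the threshold comes from the text.

```python
from collections import Counter


def successive_pairs(events, max_gap=0.01):
    """Count how often one byte range is accessed immediately after
    another within max_gap seconds.

    events: iterable of (timestamp, start, stop) tuples.
    Returns a Counter keyed by ((start1, stop1), (start2, stop2)).
    """
    ordered = sorted(events)              # order the events by time stamp
    pairs = Counter()
    # Walk over each event together with the event that follows it.
    for (t1, s1, e1), (t2, s2, e2) in zip(ordered, ordered[1:]):
        close_in_time = (t2 - t1) <= max_gap
        different_range = (s1, e1) != (s2, e2)
        if close_in_time and different_range:
            pairs[((s1, e1), (s2, e2))] += 1
    return pairs


# Example: the range 0-99 is followed twice by the range 5000-5099.
sample = [
    (0.000, 0, 99), (0.001, 5000, 5099),
    (1.000, 0, 99), (1.002, 5000, 5099),
    (2.000, 300, 399),
]
counts = successive_pairs(sample)
```

Here `counts` holds a single entry, `((0, 99), (5000, 5099))` with a count of 2, which marks those two ranges as candidates for being stored near each other on disk.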