The slow mechanical nature of input/output (I/O) devices such as disks, compared to the speed of electronic processing, has made I/O a major bottleneck in computer systems. As the improvement in processor performance continues to far exceed the improvement in disk access time, the I/O bottleneck is becoming more significant. It is therefore necessary to find effective techniques to improve I/O performance. One possible approach for increasing effective disk performance is to reorganize data blocks on a disk in anticipation of which data blocks are likely to be accessed together by the users. Typically, groups of data blocks are accessed together in a generally predictable manner. Thus, data blocks that are accessed contemporaneously might be laid out close together on the disk so that the delay associated with moving the read/write head is minimized.
Previous attempts at block reorganization have concentrated on identifying data items (data blocks, disk cylinders, or data files) that are accessed frequently, i.e., hot, and then packing these items together based on their frequency of access (referred to as heat) so that as much heat as possible is clustered into as small a storage region as possible. See, for example, U.S. Pat. No. 5,765,204, “Method and Apparatus for Adaptive Localization of Frequently Accessed, Randomly Addressed Data.”
FIG. 1 illustrates the data blocks 10 on a disk and their typical frequency of access before any reorganization. The rows of squares in FIG. 1 represent the data blocks laid out on the disk, where the block at the extreme right of the first row is located immediately before the block at the extreme left of the second row, and so on. The dark blocks, like block 11, are the most frequently accessed by the users. The dotted blocks, like block 13, are the next most frequently accessed, but less so than the dark blocks. The slashed blocks, like block 12, are also frequently accessed by the users, but less so than the dotted blocks.
FIG. 2 illustrates a prior art reorganization of the data blocks in which the data blocks are laid out in a reorganized region 23 in a serial fashion in the order of their access frequency. The most frequently accessed data blocks 20 are laid out next to each other on the disk to minimize the distance the read/write head must travel to access the blocks. The next groups of frequently accessed blocks 21 and 22 are also grouped together as shown. Blocks 21 are accessed less frequently than blocks 20, and blocks 22 are accessed less frequently than blocks 21.
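The serial, heat-ordered layout of FIG. 2 might be sketched as follows. This is a minimal illustration only, not the method of any cited reference: the function name serial_layout, the use of a raw access trace as input, and the tie-breaking by original address are all assumptions introduced here.

```python
from collections import Counter

def serial_layout(access_trace, region_size):
    """Return the block addresses to place in the reorganized region, hottest first.

    access_trace: iterable of block addresses in the order they were accessed.
    region_size:  number of blocks the reorganized region can hold (assumed).
    """
    # Heat is taken to be the access count of each block address.
    heat = Counter(access_trace)
    # Sort by descending heat; ties are broken by original address (an
    # assumption made here so that the result is deterministic).
    ranked = sorted(heat, key=lambda block: (-heat[block], block))
    return ranked[:region_size]

# Hypothetical trace: block 7 is hottest, then blocks 3 and 9, then block 12.
trace = [7, 3, 7, 9, 3, 7, 12, 9, 7]
layout = serial_layout(trace, region_size=3)
print(layout)
```

Note that the resulting order depends only on heat, so blocks that were contiguous in the original layout may end up separated.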
FIG. 3 illustrates a prior art reorganization of the data blocks in an organ-pipe fashion. The most frequently accessed blocks 30 are laid out at the center of the reorganization region. The next most frequently accessed blocks are laid out on each side of blocks 30 as blocks 31 and blocks 32. The even less frequently accessed blocks 33 and 34 are at the ends of the reorganization region, as shown.
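One way to picture the organ-pipe arrangement of FIG. 3 is to place blocks in descending order of heat, alternating between the two sides of the region. The sketch below is illustrative only; the function name organ_pipe_layout, the heat dictionary, and the choice of which side receives the second-hottest block are assumptions made here.

```python
from collections import deque

def organ_pipe_layout(heat):
    """Arrange blocks so that heat peaks at the center and falls off toward both ends.

    heat: dict mapping block address -> access count (hypothetical input).
    """
    # Rank blocks from hottest to coldest; ties broken by address (assumed).
    ranked = sorted(heat, key=lambda block: (-heat[block], block))
    region = deque()
    for rank, block in enumerate(ranked):
        if rank % 2 == 0:
            region.append(block)      # even ranks extend the right side
        else:
            region.appendleft(block)  # odd ranks extend the left side
    return list(region)

# Hypothetical heat values for five blocks.
heat = {10: 50, 11: 40, 12: 30, 13: 20, 14: 10}
print(organ_pipe_layout(heat))
```

With the hypothetical values above, the hottest block 10 lands in the middle of the region and the two coldest blocks, 13 and 14, land at its ends.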
The problem with these prior art approaches is that contiguous data that used to be accessed together could be split up. More important, the access sequence typically exhibits some spatial locality even before the blocks are reorganized. Once the aggressive read-ahead or sequential prefetch commonly performed by the disk today is taken into account, the previously proposed reorganization techniques are seen to reduce seek distance at the far greater cost of rendering the prefetch ineffective.
FIG. 4 illustrates another prior art reorganization of the data blocks, in which the identified hot data are laid out in increasing order of their original addresses, i.e., a sequential layout. The blocks 40-42, which have different access frequencies, are reorganized according to their sequential addresses. See, for example, “Adaptive Block Rearrangement,” Akyurek et al., ACM Transactions on Computer Systems, Vol. 13, No. 2, pages 89-121, May 1995. The problem with this technique is that the result is sensitive to the original block layout, and especially to user/administrator actions such as the order in which workloads are migrated or loaded onto the disk.
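The sequential layout of FIG. 4 can be sketched in two steps: select the hot blocks, then order them by original address rather than by heat. The code below is a simplified illustration under assumptions made here (the function name sequential_hot_layout, an access trace as input, and a fixed region size); it is not taken from the cited paper.

```python
from collections import Counter

def sequential_hot_layout(access_trace, region_size):
    """Pick the hottest blocks, then lay them out in increasing original address."""
    heat = Counter(access_trace)
    # Step 1: choose the region_size hottest blocks (ties broken by address).
    hottest = sorted(heat, key=lambda block: (-heat[block], block))[:region_size]
    # Step 2: discard the heat ordering and sort by original address.
    return sorted(hottest)

# Hypothetical trace: block 7 is hottest, followed by blocks 3, 9 and 12.
trace = [7, 3, 7, 9, 3, 7, 12, 9, 7, 3]
print(sequential_hot_layout(trace, region_size=3))
```

Because the final order is inherited from the original addresses, two disks holding the same data loaded in a different order would yield different layouts, which is the sensitivity noted above.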
More recently, the idea of packing data blocks that are likely to be used together into a larger superunit has been investigated by Matthews et al. in “Improving The Performance of Log-Structured File Systems With Adaptive Methods,” Sixteenth ACM Symposium on Operating System Principles (SOSP '97), 1997. In this study, neither the superunits nor the blocks within each superunit are ordered. Without ordering the data blocks, the effect of such clustering is merely to move related blocks close together to reduce the seek distance.
The above-mentioned prior art focuses mainly on reducing only the seek distance. This is not very effective at improving disk performance since it does not affect rotational latency, which constitutes about half of the disk access time. Moreover, any seek, regardless of distance, is a costly operation because of inertia and head settling time. With faster and smaller-diameter disks, the time difference between a short seek and a long seek is further diminished.
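The rotational component referred to above can be estimated from the spindle speed alone, since on average the head must wait half a revolution for the target sector. The sketch below computes only that component; the function name and the spindle speeds used are illustrative assumptions, and no seek or transfer times are modeled.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: the time for half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0

# Illustrative spindle speeds (assumed, not taken from the text above).
for rpm in (5400, 7200, 10000):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
```

At 10,000 revolutions per minute, for example, one revolution takes 6 ms, so the average rotational latency is 3 ms no matter how short the preceding seek was.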
Others have also considered laying out blocks in the sequence in which they are likely to be used. See, for example, the “Intel Application Launch Accelerator” by Intel Corporation, http://www.intel.com/ial/ala. However, this accelerator relies on external knowledge to identify patterns that are likely to be repeated, requiring, for instance, operating system support or requiring software vendors to preoptimize their applications. It does not automatically detect repeated sequences from the access sequence of a real workload.
There has also been recent work on identifying blocks or files that are accessed together so that the next time a context is recognized, the files and blocks can be prefetched accordingly. An example of this work is described by Kroeger et al. in “Predicting File System Actions From Prior Events,” Proceedings of the USENIX 1996 Annual Technical Conference, pages 319-328, January 1996. The effectiveness of this approach, however, is constrained by the amount of locality that is present in the request stream, by the fact that it does not improve fetch efficiency, and by the tendency for I/O requests to arrive together, which makes it difficult to prefetch in time.
Various heuristics have also been used to lay out data on disk so that items (e.g., files) that are expected to be used contemporaneously are located close to each other. The shortcoming of these techniques is that they are based on static information, such as name space relationships of files, which may not reflect the actual access behavior. Furthermore, files become fragmented over time. The blocks belonging to individual files can be gathered and laid out contiguously in a process known as defragmentation, as described by McDonald et al. in “Dynamically Restructuring Disk Space For Improved File System Performance,” Technical Report 88-14, Dept. of Computational Science, University of Saskatchewan, Saskatchewan, Canada, July 1988. But defragmentation does not handle inter-file access patterns, and its effectiveness is limited by the file size, which tends to be small. Moreover, defragmentation assumes that blocks belonging to the same file tend to be accessed together. This assumption may not hold for large files or database tables, or during an application launch, when many seeks remain even after defragmentation.
Therefore, there remains a need for a storage system and method for reorganizing data to effectively increase performance without the above-described disadvantages.