In many conventional high-demand storage systems that store high-volume business transaction data, the logical address space is mapped so that the actual data is stored on multiple physical drives. Some updating operations access one physical drive while others access another physical drive. Such a multiple-drive configuration provides a substantial performance enhancement over systems having only one physical drive.
In many such storage systems, write transactions are distributed on a substantially random basis. Because business transactions vary widely in nature, it is difficult to predict the location of data updates, and there is a natural tendency toward randomly or widely distributed write transactions. Accordingly, the write transactions that occur are typically distributed throughout the storage space.
In certain applications, such write transactions may be clustered in small areas so that they can be written efficiently. However, in a percentage of cases large enough to have a significant impact on performance, such write transactions are not written efficiently.
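By way of non-limiting illustration only, the mapping of a logical address space onto multiple physical drives may be sketched as follows. The round-robin striping rule, the function name locate, and the parameters num_drives and stripe_blocks are assumptions made for this sketch; the text above does not specify any particular mapping scheme.

```python
def locate(logical_block: int, num_drives: int, stripe_blocks: int = 1) -> tuple[int, int]:
    """Map a logical block to (drive index, block offset on that drive).

    Assumed scheme: consecutive stripes of `stripe_blocks` blocks are
    dealt out to the drives in round-robin order.
    """
    stripe = logical_block // stripe_blocks          # which stripe holds the block
    drive = stripe % num_drives                      # drive that holds that stripe
    offset = (stripe // num_drives) * stripe_blocks + logical_block % stripe_blocks
    return drive, offset


# Neighbouring logical blocks land on different drives, so independent
# updates can proceed on different drives at the same time.
placements = [locate(block, num_drives=4) for block in range(6)]
```

Under this assumed rule, logical blocks 0 to 3 fall on drives 0 to 3 respectively, and block 4 wraps around to the second block of drive 0.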
Conventional large volume storage systems commonly utilize a redundant array of independent disks (RAID). In such a RAID array, multiple drives are logically coupled to provide a larger composite storage entity that exhibits better storage capacity, performance and reliability than would be provided by a single drive or a group of unrelated drives.
In accordance with the exemplary, illustrative embodiments of the present invention described below, the inefficiencies in prior art storage systems are addressed in part by continuously remapping where data is stored. The remapping is designed so that writing occurs at an optimum speed tuned to the storage system being utilized, e.g., the writes occur in substantially sequential disk storage locations to the extent possible, or in other patterns or distributions determined to minimize the time or resources necessary to accomplish the write activity. Thus, by analogy, in an exemplary embodiment, writing to disk occurs sequentially as if, for example, the disk were a tape.
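By way of non-limiting illustration only, the remapping idea may be sketched as a toy model in which every write, whatever its logical address, is placed at the next sequential physical location and a map records where each logical block currently lives. The class name SequentialRemapper, its fields, and the use of an in-memory list in place of a disk are assumptions made for this sketch and are not taken from the embodiments.

```python
class SequentialRemapper:
    """Toy model: each write goes to the next free physical block."""

    def __init__(self) -> None:
        self.mapping = {}   # logical block -> current physical block
        self.log = []       # in-memory stand-in for the physical medium

    def write(self, logical_block: int, data: bytes) -> int:
        physical_block = len(self.log)                # next sequential location
        self.log.append(data)
        self.mapping[logical_block] = physical_block  # any older copy is now stale
        return physical_block

    def read(self, logical_block: int) -> bytes:
        return self.log[self.mapping[logical_block]]


remapper = SequentialRemapper()
# Widely scattered logical addresses, including a rewrite of block 17.
placements = [remapper.write(lb, f"v{lb}".encode()) for lb in (9041, 17, 523, 17)]
```

Although the logical addresses are scattered, the physical placements in this sketch are consecutive, which is the tape-like behavior described above. The stale copy left behind by the rewrite of block 17 is not reclaimed here.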
Using the methodology described herein with respect to the illustrative embodiments, write speeds may come closer than in conventional storage systems to the optimum speed at which the disk device is capable of writing. By remapping data on a substantially continuous basis as described herein, writes advantageously occur nearer the sequential performance limits of the particular drive being utilized when compared with conventional high-volume business storage systems. By continuously remapping data, the time-consuming seek operations that typically occur during conventional write transaction processing are minimized. Further, in accordance with the disclosed exemplary embodiments, a methodology is described for persistently maintaining the mapping information in relatively long-term storage, in a consistent state suitable for recovery or other purposes.
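By way of non-limiting illustration only, one possible way of keeping mapping information in a recoverable, consistent state is an append-only journal of checksummed mapping records that is replayed after a failure. The record layout, the CRC32 checksum, the function names append_mapping and recover_mapping, and the use of io.BytesIO in place of long-term storage are all assumptions made for this sketch; the text above does not state how the embodiments persist the mapping.

```python
import io
import struct
import zlib

# Assumed record layout: logical block, physical block, CRC32 of both fields.
RECORD = struct.Struct("<QQI")


def append_mapping(journal, logical_block: int, physical_block: int) -> None:
    """Append one checksummed mapping record to the end of the journal."""
    payload = struct.pack("<QQ", logical_block, physical_block)
    journal.seek(0, io.SEEK_END)
    journal.write(payload + struct.pack("<I", zlib.crc32(payload)))


def recover_mapping(journal) -> dict:
    """Replay the journal, stopping at the first torn or corrupt record."""
    journal.seek(0)
    mapping = {}
    while True:
        raw = journal.read(RECORD.size)
        if len(raw) < RECORD.size:
            break                      # clean end, or a torn final record
        logical_block, physical_block, crc = RECORD.unpack(raw)
        if zlib.crc32(raw[:16]) != crc:
            break                      # corrupt record: keep last consistent state
        mapping[logical_block] = physical_block   # later records win
    return mapping


journal = io.BytesIO()
append_mapping(journal, 17, 0)
append_mapping(journal, 523, 1)
append_mapping(journal, 17, 2)         # block 17 remapped to a new location
journal.write(b"\x00" * 7)             # simulate a write interrupted by a failure
recovered = recover_mapping(journal)
```

In this sketch the interrupted final record is ignored on replay, so the recovered map reflects the last fully written state.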
The illustrative methods and apparatus described herein dramatically improve the performance of persistent storage media having non-uniform access time. The illustrative embodiments provide significant benefits to the write (i.e., update, modify) bandwidth or throughput in persistent storage environments. The preferred embodiments improve the external performance of a random access storage system, such as a SAN, NAS or RAID array, when the actual data storage is contained on media with a non-uniform access time, for example a conventional disk whose maximum performance under sequential access differs from that under random access or even substantially non-sequential access. The illustrative embodiments can be implemented at many locations between, inclusively, the application and the physical storage media's onboard device logic. The preferred embodiment would utilize processing and short-term memory resources of a storage subsystem, such as the main processing systems used in external disk subsystems like NAS, SAN or RAID boxes. Additional embodiments provide dramatic improvements to specific applications, such as databases, business transaction systems and content management systems, by employing variations of the methods at the application and/or operating system level.
The illustrative embodiments achieve dramatic gains by introducing a continuous, clustered or burst-mode, optimized dynamic reorganization of the storage media, or of application container files. The reorganization converts concurrent updates to one or more sequential streams, as appropriate to the number of physical devices and channels over which the updates can be dispersed. It thereby minimizes the number of seek operations (or, on non-disk media with non-uniform access time, the number of relatively more time-consuming accesses) necessary on each physical device. The methods include provisions to maintain logical consistency of the information in a wide variety of failure or recovery scenarios.
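By way of non-limiting illustration only, the conversion of concurrent updates into one sequential stream per physical device may be sketched as follows. The round-robin choice of device, the function name disperse, and the representation of each stream as an in-memory list are assumptions made for this sketch; an actual embodiment might select devices and channels on an entirely different basis.

```python
from itertools import cycle


def disperse(updates, num_devices: int):
    """Give each update the next sequential slot on a device chosen round-robin.

    `updates` is an iterable of (logical_block, data) pairs.
    Returns (mapping, streams), where mapping[logical_block] = (device, slot)
    and streams[device] is the sequential stream written to that device.
    """
    mapping = {}
    streams = [[] for _ in range(num_devices)]
    for device, (logical_block, data) in zip(cycle(range(num_devices)), updates):
        slot = len(streams[device])              # next sequential slot on the device
        streams[device].append(data)
        mapping[logical_block] = (device, slot)  # latest location wins
    return mapping, streams


updates = [(9041, b"a"), (17, b"b"), (523, b"c"), (88, b"d"), (17, b"e")]
mapping, streams = disperse(updates, num_devices=2)
```

Each device in this sketch receives only appends, so no device has to seek between scattered locations while the stream is being written.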
The illustrative methodology also provides the ability to incrementally and efficiently reorganize the physical storage media to optimize contiguous read activity. The reorganization process can be accomplished without degrading performance during normal operation by utilizing otherwise idle time in the affected storage media.
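By way of non-limiting illustration only, a reorganization that favors contiguous reads may be sketched as copying the live blocks into logical order and discarding stale copies. The function name reorganize and the in-memory list standing in for the medium are assumptions made for this sketch. For brevity the sketch performs the reorganization in a single pass, whereas the text above contemplates incremental reorganization during otherwise idle time.

```python
def reorganize(mapping: dict, log: list):
    """Return (new_mapping, new_log) with live blocks laid out in logical order.

    Physical blocks no longer referenced by `mapping` are treated as stale
    and are not copied.
    """
    new_mapping = {}
    new_log = []
    for logical_block in sorted(mapping):
        new_mapping[logical_block] = len(new_log)     # next contiguous location
        new_log.append(log[mapping[logical_block]])   # copy only the live version
    return new_mapping, new_log


# State after scattered writes, with a stale copy of block 17 at position 1.
log = [b"v9041", b"v17-old", b"v523", b"v17-new"]
mapping = {9041: 0, 523: 2, 17: 3}
new_mapping, new_log = reorganize(mapping, log)
```

After the pass, a read of logical blocks in ascending order touches consecutive physical locations in this sketch.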
The illustrative methodology described herein may be used in conjunction with traditional read caching and write caching approaches, improving the overall performance of the subsystem beyond what can be achieved with caching alone.