This section is intended to introduce the reader to various aspects of the art that may be related to various aspects of the present invention. The following discussion is intended to provide information to facilitate a better understanding of the present invention. Accordingly, it should be understood that statements in the following discussion are to be read in this light, and not as admissions of prior art.
Today, there are many types of persistent storage used by network attached storage servers, including magnetic disk storage, solid state storage, and battery-backed RAM. These types of storage can be used by a typical NAS or SAN server, storing all of the data in a virtual disk or file system, or they can be used in a tier or cache server that stores only the most recently accessed data from a disk or file system.
In either type of storage system, storing data in the best type of persistent storage for its reference pattern can result in a much better ratio of storage system cost per storage system operation. For example, NVRAM (battery-backed RAM) provides the fastest random read or write rates of the three example storage media mentioned above, but it is also currently the most expensive, perhaps five times as expensive as the next most expensive medium (flash or solid state storage). Flash storage provides comparable random read performance to NVRAM at a small fraction of the cost, but getting good random write performance from flash storage is a challenge, and heavy random writing also shortens the overall lifetime of the flash device. Standard magnetic disk storage handles sequential read and write requests nearly as fast as any other persistent storage medium, at the lowest cost of all, but loses the vast majority of its performance if the read or write requests are not for sequentially stored data.
Thus, if a storage system can place the various types of data in the appropriate type of storage, it can deliver a much better price/performance ratio than one that simply uses a single type of persistent storage.
Existing systems make use of a mix of types of persistent storage in a number of ways. Many file servers, going back to Sun Microsystems' PrestoServe board for its SunOS-based file servers, have used NVRAM to reduce write latencies by providing temporary persistent storage for new incoming data. In the SSD arena, NetApp's PAM2 card is a victim cache made from SSD holding data that doesn't fit in memory, speeding up random reads to the data stored in the card. For a number of reasons, even though this cache is made from persistent storage, the NetApp PAM2 cache does not hold modified data that is not persistently held elsewhere, either in an NVRAM card or on rotating disks. And of course, an obvious use of SSD drives 48 is as a replacement for existing drives, providing faster read access, especially random read access, at some penalty in both cost and write performance. Systems like ONTAP/GX can also make more intelligent use of flash or SSD drives 48 by migrating entire volumes to storage aggregates composed entirely of SSD; in the ONTAP/GX case, this would allow a portion of a namespace to be moved to SSD, although only in its entirety, only all at one time, and only at pre-defined volume boundaries.
It is in this context that this invention operates. This invention allows the mixing of SSD and normal rotating magnetic drives (hereafter referred to as hard disk drives, or HDDs) in the same file system, instead of as a cache or as a separate type of aggregate, and provides a number of policy mechanisms for controlling the exact placement of files within the collection of storage pools.
This provides a number of advances over the state of the art. First, because individual files can be split, at the block level, between SSD and HDD storage, or between other types of persistent storage, this server can place storage optimally at a very fine level of granularity. For example, consider the case of a media server where files are selected randomly for playing, but where, once selected, the entire file is typically read sequentially. This system could store the first megabyte or so in SSD, with the remainder of what may well be a 200 MB or larger file stored on much less expensive HDD. The latency for retrieving the first 1 MB of data would be very low because the data is stored in SSD, and by the time that this initial segment of data has been delivered to the NAS client, the HDD could be transferring data at its full rate, after having performed its high latency seek operation concurrent with the initial segment's transfer from SSD. To fully benefit from this flexibility in allocation, the storage system needs to apply allocation policies to determine where and when to allocate file space to differing types of storage. This invention allows policies to be provided globally, or for individual exports of either NAS or SAN data, or for arbitrarily specified subtrees in a NAS name space.
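The media server example above can be illustrated with a minimal sketch. It assumes a hypothetical policy parameter, `ssd_head_bytes`, giving how much of the start of each file is placed on SSD, and an assumed block size of 4096 bytes; neither the names nor the values are specified in this description, and an actual embodiment may represent placement quite differently.

```python
from __future__ import annotations

from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed block size in bytes; not specified in the text


@dataclass(frozen=True)
class Extent:
    """A contiguous run of file blocks placed on one storage type."""
    storage: str      # "SSD" or "HDD"
    first_block: int  # index of the first block in the run
    block_count: int  # number of blocks in the run


def place_file(file_size: int, ssd_head_bytes: int) -> list[Extent]:
    """Split a file at block granularity: head on SSD, remainder on HDD.

    `ssd_head_bytes` is a hypothetical policy value (e.g. 1 MiB).
    """
    total_blocks = -(-file_size // BLOCK_SIZE)  # ceiling division
    ssd_blocks = min(total_blocks, -(-ssd_head_bytes // BLOCK_SIZE))

    extents: list[Extent] = []
    if ssd_blocks:
        extents.append(Extent("SSD", 0, ssd_blocks))
    if total_blocks > ssd_blocks:
        extents.append(Extent("HDD", ssd_blocks, total_blocks - ssd_blocks))
    return extents


if __name__ == "__main__":
    # A 200 MiB media file with the first 1 MiB kept on SSD.
    for extent in place_file(200 * 2**20, 2**20):
        print(extent)
```

Under these assumptions, a file smaller than the SSD head size is placed entirely on SSD, while a large file yields one SSD extent followed by one HDD extent, so that the HDD seek can overlap the transfer of the initial segment.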
As compared with prior art that allows whole volumes to be relocated from HDD to SSD, or vice versa, this invention provides many benefits. First, the invention requires neither entire volumes nor even entire files to move between storage types, and second, data can be placed in its optimal location initially, based on the specified policies. As compared with using flash in a victim cache, use of flash memory as file system storage allows improvement in write operation performance, since data written never needs to be stored on HDD at all. In addition, this invention allows policies to change at any directory level, not just at volume boundaries.
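One way to picture policies that change at any directory level is a lookup that walks from a file's path up through its parent directories and applies the nearest policy found, falling back to a global policy. The sketch below is illustrative only: the function `resolve_policy`, the policy table, and the policy names are hypothetical and are not taken from this description.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def resolve_policy(path: str, policies: dict[str, str], default: str) -> str:
    """Return the policy of the nearest ancestor directory that has one.

    `policies` maps directory paths to hypothetical policy names;
    `default` stands in for a globally specified policy.
    """
    current = PurePosixPath(path)
    # Check the path itself, then each parent, from deepest to the root.
    for candidate in (current, *current.parents):
        key = str(candidate)
        if key in policies:
            return policies[key]
    return default


if __name__ == "__main__":
    # Hypothetical policy table: any subtree may override its parent.
    table = {
        "/media": "ssd-head-1MiB",
        "/media/archive": "hdd-only",
    }
    print(resolve_policy("/media/video/a.mp4", table, "global-default"))
    print(resolve_policy("/media/archive/old.mp4", table, "global-default"))
    print(resolve_policy("/home/user/notes.txt", table, "global-default"))
```

Because the lookup is keyed on directory paths rather than on volumes, a subtree such as `/media/archive` can carry a different policy from its parent without being a separate volume.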