A storage system is a processing system adapted to store and retrieve information/data on storage devices (such as disks). The storage system includes a storage operating system that may implement a file system to logically organize the information as a hierarchical structure of directories and files on the storage devices. Each file may comprise a set of data blocks, whereas each directory may be implemented as a specially-formatted file in which information about other files and directories is stored.
The storage operating system generally refers to the computer-executable code operable on a storage system that manages data access and access requests (read or write requests requiring input/output operations) and may implement file system semantics in implementations involving storage systems. In this sense, the Data ONTAP® storage operating system, available from NetApp, Inc., Sunnyvale, Calif., which implements a Write Anywhere File Layout (WAFL®) file system, is an example of such a storage operating system implemented as a microkernel within an overall protocol stack and associated storage. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
A storage system's storage is typically implemented as one or more storage volumes that comprise physical storage devices, defining an overall logical arrangement of storage space. Available storage system implementations can serve a large number of discrete volumes. A storage volume is “loaded” in the storage system by copying the logical organization of the volume's files, data, and directories, into the storage system's memory. Once a volume has been loaded in memory, the volume may be “mounted” by one or more users, applications, devices, and the like, that are permitted to access its contents and navigate its namespace.
A storage system may be configured to allow server systems to access its contents, for example, to read or write data to the storage system. A server system may execute an application that “connects” to the storage system over a computer network, such as a shared local area network (LAN), wide area network (WAN), virtual private network (VPN) implemented over a public network such as the Internet, or Storage Area Network (SAN). The application executing on the server system may send an access request (read or write request) to the storage system for accessing particular data stored on the storage system.
The storage system may typically implement large capacity storage devices, comprising disk devices, for storing data. As known in the art, a disk device stores data on sectors, a sector comprising a minimum data size for input/output (I/O) operations (such as read/write requests) of the disk device. Each sector stores a fixed amount of user-accessible data (client data), the sector size being 512 bytes (referred to herein as a legacy sector size) for conventional disk devices (referred to as legacy disk devices). As such, a legacy disk device may store client data (e.g., data received from an application) in 512 byte addressable sectors. Currently, advanced disk devices, known as Advanced Format disk devices, store client data in sectors comprising 4,096 bytes (referred to as 4 k bytes) or more (referred to herein as advanced sector sizes). As such, an advanced disk device may store client data in at least 4,096 byte addressable sectors. The larger sector sizes of advanced disk devices have been driven by several factors, including the increasing data sizes of the storage volumes of client data. Due to the larger amounts of client data to be stored by disk devices, the conventional minimum I/O size of 512 bytes may be inadequate, and the higher minimum I/O sizes of advanced disk devices are being implemented by most disk device manufacturers, with the intention of further increasing the sector size over time.
Although advanced disk devices are increasingly being used, there still persist large numbers of legacy systems comprising legacy applications, legacy volumes, and legacy disk devices. Legacy applications may submit read/write requests based on a legacy sector size to legacy volumes comprising data formatted based on a legacy sector size, the legacy volumes being stored on legacy disk devices comprising sectors of the legacy sector size. Since legacy disk devices are being phased out by disk manufacturers, issues occur when legacy disk devices fail and are replaced by advanced disk devices. The data of the legacy volumes previously stored on the legacy disk devices are still typically formatted based on a legacy sector size, but are stored to advanced disk devices that are based on an advanced sector size. For example, the legacy volumes may comprise data blocks formatted and addressed based on 512 byte sectors, but stored to advanced disk devices. Also, legacy applications will still assume that the legacy volume is stored to a legacy disk device and will still submit access requests (read/write requests) that are based on 512 byte sectors (e.g., specify a storage address that is based on 512 byte sectors).
As such, emulation methods have been developed and implemented on advanced disk devices to emulate legacy disk devices for legacy applications and legacy volumes. Conventionally, when an advanced disk device receives a legacy access request for a legacy volume, the advanced disk device may execute the emulation methods to perform 1) storage address conversion, and 2) emulation I/O operations. The storage address conversion may convert the received storage address that is based on 512 byte sectors (received in the access request) to a converted storage address that is based on 4 k byte sectors. The converted storage address may comprise a storage address corresponding to a start of a corresponding 4 k byte sector, and an offset position within the corresponding 4 k byte sector. The emulation I/O operations may perform various I/O operations on client data depending on whether the legacy access request is a read or write request. Note that performing the storage address conversion is a simple and straightforward procedure requiring minimal time or resources of the disk device. Performing the emulation I/O operations, however, typically requires significantly more time and resources of the disk device.
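The storage address conversion described above can be illustrated with a short sketch. It assumes that a storage address is a zero-based sector number, that an advanced sector is exactly 4,096 bytes, and that the offset is expressed in bytes; the name convert_address is hypothetical and is not taken from any disk device firmware.

```python
LEGACY_SECTOR_SIZE = 512       # bytes per legacy sector
ADVANCED_SECTOR_SIZE = 4096    # bytes per advanced sector (assumed exactly 4 k bytes)
BLOCKS_PER_SECTOR = ADVANCED_SECTOR_SIZE // LEGACY_SECTOR_SIZE  # eight blocks per sector


def convert_address(legacy_address):
    """Convert a 512-byte-sector address to (4 k sector address, byte offset).

    Hypothetical helper: addresses are assumed to be zero-based sector numbers.
    """
    if legacy_address < 0:
        raise ValueError("legacy_address must be non-negative")
    # The quotient selects the 4 k byte sector; the remainder selects the
    # 512-byte block inside that sector.
    sector, block_index = divmod(legacy_address, BLOCKS_PER_SECTOR)
    return sector, block_index * LEGACY_SECTOR_SIZE
```

Under these assumptions, legacy address 11 maps to 4 k byte sector 1 at a byte offset of 1,536, which reflects why the conversion itself requires only an integer division.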
A legacy read request will typically request reading of one or more data blocks that start at a storage address, each data block being based on a 512 byte sector and comprising 512 bytes of data. The received storage address is based on 512 byte sectors and is converted to a storage address based on 4 k byte sectors that locates a corresponding 4 k byte sector and an offset position within the corresponding 4 k byte sector. For a legacy read request, the emulation I/O operations may include retrieving the corresponding 4 k byte sector from the disk device at the converted storage address. The corresponding 4 k byte sector may comprise a 4 k byte sector on the disk device that contains the requested data blocks. For example, the legacy read request may request three 512-byte data blocks that are stored within the corresponding 4 k byte sector on the disk device. As each 4 k byte sector stores eight 512-byte data blocks, the corresponding 4 k byte sector comprises the three requested 512-byte data blocks and five additional 512-byte data blocks. Since the minimum I/O size of the advanced disk device is a 4 k byte sector, the emulation method must read all eight 512-byte data blocks of the corresponding 4 k byte sector and store them to a disk memory. From disk memory, the emulation method may then retrieve and return the three requested 512-byte data blocks at the offset position and ignore the five additional 512-byte data blocks. Emulation for legacy read requests has been shown to be a relatively efficient and data-secure process.
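The read emulation described above may be sketched as follows. This is an illustration only: the disk device is modeled as an in-memory bytearray, the function name emulated_read is hypothetical, and the handling of a request that crosses a 4 k byte sector boundary (reading every sector the request touches) is an assumption not stated above.

```python
LEGACY = 512      # legacy sector size in bytes
ADVANCED = 4096   # advanced sector size in bytes (assumed exactly 4 k bytes)


def emulated_read(disk, legacy_address, count):
    """Return `count` 512-byte blocks starting at `legacy_address`.

    `disk` is a bytearray standing in for an advanced disk device, so every
    access to it below is made in whole 4 k byte sectors.
    """
    if legacy_address < 0 or count < 1:
        raise ValueError("invalid read request")
    start = legacy_address * LEGACY
    end = start + count * LEGACY
    first_sector = start // ADVANCED
    last_sector = (end - 1) // ADVANCED
    if (last_sector + 1) * ADVANCED > len(disk):
        raise ValueError("read past end of disk")
    # Read the full sector(s) into a buffer that plays the role of disk memory.
    buffer = bytes(disk[first_sector * ADVANCED:(last_sector + 1) * ADVANCED])
    # Return only the requested blocks at the offset; the rest is ignored.
    offset = start - first_sector * ADVANCED
    return buffer[offset:offset + count * LEGACY]
```

For a request of three blocks at legacy address 2, the sketch reads all eight blocks of sector 0 and returns only blocks 2 through 4, discarding the other five.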
Emulation for legacy write requests, however, has been more problematic for advanced disk devices. Emulation of legacy write requests may become difficult since the minimum write size of the advanced disk device is a 4 k byte sector. A legacy write request will typically comprise one or more data blocks (write data blocks) to be written starting at a storage address, each write data block being based on a 512 byte sector and comprising 512 bytes. The received storage address is based on 512 byte sectors and is converted to a storage address (based on 4 k byte sectors) of a corresponding 4 k byte sector and an offset position within the corresponding 4 k byte sector.
For a legacy write request, the emulation I/O operations may include retrieving a corresponding 4 k byte sector from the disk device at the converted storage address. The corresponding 4 k byte sector may comprise a 4 k byte sector on the disk device where the write data blocks are to be written/stored. For example, the legacy write request may comprise three 512-byte data blocks that are to be stored within the corresponding 4 k byte sector on the disk device. The emulation method reads all eight 512-byte data blocks of the corresponding 4 k byte sector and stores them to a disk memory. In disk memory, the emulation method may then insert the three write data blocks at the determined offset position within the corresponding 4 k byte sector, and then write all eight 512-byte data blocks of the modified corresponding 4 k byte sector from disk memory to the disk device at the converted storage address.
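The read-modify-write sequence described above may be sketched as follows. The disk device is again modeled as an in-memory bytearray, the name emulated_write is hypothetical, and the sketch assumes for simplicity that the write data fits within a single 4 k byte sector.

```python
LEGACY = 512      # legacy sector size in bytes
ADVANCED = 4096   # advanced sector size in bytes (assumed exactly 4 k bytes)


def emulated_write(disk, legacy_address, data):
    """Write 512-byte blocks into a 4 k sector image using read-modify-write.

    `disk` is a bytearray standing in for an advanced disk device. This sketch
    assumes the write data does not cross a 4 k byte sector boundary.
    """
    if legacy_address < 0 or len(data) == 0 or len(data) % LEGACY != 0:
        raise ValueError("write data must be a whole number of 512-byte blocks")
    start = legacy_address * LEGACY
    base = (start // ADVANCED) * ADVANCED   # converted 4 k sector address, in bytes
    offset = start - base                   # offset position within that sector
    if offset + len(data) > ADVANCED:
        raise ValueError("write spans more than one 4 k byte sector")
    if base + ADVANCED > len(disk):
        raise ValueError("write past end of disk")
    # 1) Read all eight 512-byte blocks of the sector into disk memory.
    buffer = bytearray(disk[base:base + ADVANCED])
    # 2) Insert the write data blocks at the offset position.
    buffer[offset:offset + len(data)] = data
    # 3) Write the entire modified sector back to the disk device.
    disk[base:base + ADVANCED] = buffer
```

Note that step 3 rewrites the blocks that the application did not send, which is the source of the extra work and exposure that write emulation introduces.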
Emulation for legacy write requests has typically shown performance and data integrity issues. With regard to performance, the extra steps of reading a 4 k byte sector of data, inserting 512-byte data blocks, and then rewriting the entire 4 k byte sector of data require significant time and resources of the advanced disk device. As such, the advanced disk device will be capable of performing fewer writes in a given amount of time, reducing the overall throughput of the advanced disk device. With regard to data integrity, while the write data is being written to the disk device, interruptions (power, or otherwise) may cause the write data to be lost. Between the time the write request is received at the disk device, and the time the disk device returns a completion message (indicating that the write request is successfully completed on the disk device), the write data is considered to be indeterminate. If an interruption prevents successful completion of the write request on the disk device, the responsibility falls to the application issuing the write request, not the disk device. As such, the application must recognize the failure of the write request and reissue the write request. Write emulation also creates challenges for the disk device, since it is writing more data than requested. If an interruption occurs, the application will generally only be capable of reissuing its write data. The disk device, however, is responsible for the rest of the 4 k byte sector that is to be written.
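The data integrity exposure described above can be made concrete by listing which 512-byte blocks an emulated write rewrites without the application having supplied them. The following sketch is illustrative only; the name collateral_blocks is hypothetical, and addresses are assumed to be zero-based 512-byte sector numbers.

```python
BLOCKS_PER_SECTOR = 8  # eight 512-byte blocks per 4 k byte sector (assumed)


def collateral_blocks(legacy_address, count):
    """List legacy block addresses rewritten by emulation but not requested.

    These are the blocks the application cannot reissue after an interruption,
    because it never sent them in the write request.
    """
    if legacy_address < 0 or count < 1:
        raise ValueError("invalid write request")
    first_sector = legacy_address // BLOCKS_PER_SECTOR
    last_sector = (legacy_address + count - 1) // BLOCKS_PER_SECTOR
    # Every block in every 4 k byte sector that the request touches is rewritten.
    touched = range(first_sector * BLOCKS_PER_SECTOR,
                    (last_sector + 1) * BLOCKS_PER_SECTOR)
    requested = range(legacy_address, legacy_address + count)
    return [block for block in touched if block not in requested]
```

For a three-block write at legacy address 9, the sketch reports five collateral blocks in the same 4 k byte sector, whereas a write that covers a whole sector (eight blocks starting at address 8) reports none.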
Steps are being taken to allow disk devices to maintain data integrity for interrupted writes, but this is still a new and untested area of responsibility for disk devices. Flaws in this advanced disk technology may manifest themselves as data corruptions observed by customers in the field. As such, a system and method for mitigating write emulation on a disk device is needed.