The present invention relates in general to the field of write operations in a network storage environment, and in particular to a method for I/O write request handling in a storage system, and a corresponding storage system. Still more particularly, the present invention relates to a data processing program and a computer program product for I/O write request handling in a storage system.
The technical area of this invention is caching strategies and their application in storage systems, especially a storage system supporting the CIFS protocol, although the invention is not limited to this application. A cache in the context of this invention is a component that transparently stores data so that requests for that data can be served faster.
Storage systems provide their internal storage capacity via a network to servers, personal computers, and mobile devices. Such a storage system comprises two or more controllers which present the storage capacity of the storage system via an external network to one or more external devices, e.g. a server, personal computer, mobile device, or smart device. Each storage controller can have multiple access points to the external network. The storage controller stores incoming data on internal storage media, e.g. disk drive, tape cartridge, Solid State Disk (SSD), or non-volatile RAM, via an internal network, e.g. Fibre Channel, TCP/IP, Ethernet, or InfiniBand. The storage system can comprise multiple internal storage media and multiple internal networks. Storage systems with more than two controllers are also called clustered storage systems.
A connection from an external device via the network and an access point to a controller is used to read and write data to the storage system. File-based data protocols like NFS, CIFS, FTP and HTTP are used to handle the data transfer between the external device and the storage system.
If the processing of an I/O request takes too long, the I/O handler on the external device reports a time-out error to the application. Applications may not be well prepared for such errors, so these errors may result in an application abort, potentially leaving inconsistent data on the storage system (data corruption).
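The time-out behavior described above can be illustrated with a minimal Python sketch. It is not taken from any particular protocol client; the names `submit_with_timeout`, `IOTimeoutError`, `fast_write`, and `slow_write` are hypothetical, and the delays are arbitrary values chosen only for illustration.

```python
import concurrent.futures
import time


class IOTimeoutError(Exception):
    """Hypothetical error reported to the application on a time-out."""


def submit_with_timeout(request_fn, timeout_s):
    """Run request_fn in a worker thread; give up after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(request_fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The handler stops waiting, but the request itself is not undone
        # and may still complete on the storage side later.
        raise IOTimeoutError(f"no response within {timeout_s} s") from None
    finally:
        # Do not block on a request that is still running.
        pool.shutdown(wait=False)


def fast_write():
    """Stands in for a write that is acknowledged immediately."""
    return "ok"


def slow_write():
    """Stands in for a write stalled on slow storage (assumed 0.3 s delay)."""
    time.sleep(0.3)
    return "ok"
```

In this sketch, `submit_with_timeout(slow_write, 0.05)` raises `IOTimeoutError` even though the write itself eventually finishes, which is how an application that does not handle the error can end up with an unclear view of what was actually stored.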
File-based protocols have much shorter time-outs than block-based communication, as block-based communication uses more reliable connections such as Fibre Channel. As a result of longer time-outs and more reliable connections, time-outs are rather rare in block-based I/O communication.
To speed up the processing of I/O requests, storage systems are usually tuned for performance, thus reducing the likelihood of a long-running I/O request. The use of a cache on the storage system is common practice to increase overall performance.
In a write scenario, a cache may not have any free capacity to accept a particular write request. This results in a slow response, as the write operation needs to bypass the cache and go to the slow storage, or may need to wait for a cache unit to be freed by a background write of the current cache content to slow storage. Ideally, the (write) caching will reduce the number of slow writes for a given load of the overall system.
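The two write paths described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the implementation of any particular storage system: the class `WriteCache`, its methods, and the use of a dictionary as "slow storage" are all hypothetical, and only the bypass variant (not the wait-for-free-unit variant) is modeled.

```python
class WriteCache:
    """Toy write cache with a fixed number of cache units (hypothetical model)."""

    def __init__(self, capacity_units, slow_store):
        self.capacity_units = capacity_units
        self.slow_store = slow_store  # dict standing in for slow storage
        self.dirty = {}               # block id -> data not yet on slow storage

    def write(self, block_id, data):
        """Return 'cached' for the fast path, 'bypassed' for the slow path."""
        if block_id in self.dirty or len(self.dirty) < self.capacity_units:
            # A unit is free, or the block already occupies one: fast path.
            self.dirty[block_id] = data
            return "cached"
        # No free cache unit: the write goes directly to slow storage.
        self.slow_store[block_id] = data
        return "bypassed"

    def destage_one(self):
        """Background write: move the oldest cached block to slow storage."""
        if not self.dirty:
            return False
        block_id = next(iter(self.dirty))  # dicts keep insertion order
        self.slow_store[block_id] = self.dirty.pop(block_id)
        return True
```

With a capacity of two units, the first two writes to distinct blocks are cached and a third is bypassed; after one call to `destage_one()` a unit is free again and the next write takes the fast path.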
The storage media may be slow compared to a cache device, and it is also a complex device in itself. It may operate well most of the time, but occasionally may require recalibration or internal recovery causing excessive response times (such as a rebuild of a RAID array, which is a quite I/O-intensive operation). While the average performance may not be significantly reduced, some single I/O requests may take very long. In conjunction with an unavailable cache (the cache has no free capacity) and I/O requested via a file-based protocol such as CIFS, this may lead to an application time-out on the CIFS client and, as seen by the client applications, to data corruption.
The standard mitigation is to use a faster storage system, to add more cache, or to reduce the overall load. Faster storage technology and a larger cache typically increase the cost significantly, while reducing the client load is not an option either, as it is typically outside the control of the storage system itself.