1. Field of the Invention
The invention relates generally to storage systems and more specifically relates to methods and structure for efficient, reliable buffer allocation for processing within a storage controller of a clustered storage system.
2. Discussion of Related Art
In the field of data storage, customers demand highly resilient data storage systems that also exhibit fast recovery times for stored data. One type of storage system used to provide both of these characteristics is known as a clustered storage system.
A clustered storage system typically comprises a number of storage controllers, wherein each storage controller processes host Input/Output (I/O) requests directed to one or more logical volumes. The logical volumes reside on portions of one or more storage devices (e.g., hard disks) coupled with the storage controllers. Often, the logical volumes are configured as Redundant Array of Independent Disks (RAID) volumes in order to ensure an enhanced level of data integrity and/or performance.
A notable feature of clustered storage environments is that the storage controllers are capable of coordinating processing of host requests (e.g., by shipping I/O processing between each other) in order to enhance the performance of the storage environment. This includes intentionally transferring ownership of a logical volume from one storage controller to another. For example, a first storage controller may detect that it is currently undergoing a heavy processing load, and may assign ownership of a given logical volume to a second storage controller that has a smaller processing burden in order to increase overall speed of the clustered storage system. Other storage controllers may then update information identifying which storage controller presently owns each logical volume. Thus, when an I/O request is received at a storage controller that does not own the logical volume identified in the request, the storage controller may “ship” the request to the storage controller that presently owns the identified logical volume.
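The ownership tracking and request shipping described above can be illustrated with a minimal sketch. It is not taken from any actual controller firmware. The names `Controller`, `handle`, `transfer_ownership`, and `make_cluster` are hypothetical, and a direct method call stands in for communication over the switched fabric.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical storage controller that tracks logical volume ownership."""
    name: str
    # Maps a logical volume id to the name of the controller that owns it.
    owners: dict = field(default_factory=dict)
    # Other controllers in the cluster, keyed by name (stand-in for the fabric).
    peers: dict = field(default_factory=dict)

    def handle(self, volume: str, request: str) -> str:
        owner = self.owners[volume]
        if owner == self.name:
            # This controller owns the volume, so it processes the request.
            return f"{self.name} processed {request} on {volume}"
        # Not the owner: "ship" the request to the owning controller.
        return self.peers[owner].handle(volume, request)


def transfer_ownership(volume: str, new_owner: str, cluster: list) -> None:
    """Update every controller's record of which controller owns the volume."""
    for ctrl in cluster:
        ctrl.owners[volume] = new_owner


def make_cluster():
    """Build a two-controller cluster in which controller A owns vol0."""
    a, b = Controller("A"), Controller("B")
    a.peers, b.peers = {"B": b}, {"A": a}
    for ctrl in (a, b):
        ctrl.owners["vol0"] = "A"
    return a, b
```

In this sketch, a request for `vol0` received at controller B is shipped to controller A. After `transfer_ownership("vol0", "B", [a, b])`, a request received at controller A is shipped to controller B instead.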
FIG. 1 is a block diagram illustrating an example of a prior art clustered storage system 150. Clustered storage system 150 is indicated by the dashed box, and includes storage controllers 120, switched fabric 130, and logical volumes 140. Note that a “clustered storage system” (as used herein) does not necessarily include host systems and associated functionality (e.g., hosts, application-layer services, operating systems, clustered computing nodes, etc.). However, storage controllers 120 and hosts 110 may be tightly integrated physically. For example, storage controllers 120 may comprise Host Bus Adapters (HBAs) coupled with a corresponding host 110 through a peripheral bus structure of host 110. According to FIG. 1, hosts 110 provide I/O requests to storage controllers 120 of clustered storage system 150. Storage controllers 120 are coupled via switched fabric 130 (e.g., a Serial Attached SCSI (SAS) fabric or any other suitable communication medium and protocol) for communication with each other and with a number of storage devices 142 on which logical volumes 140 are stored.
FIG. 2 is a block diagram illustrating another example of a prior art clustered storage system 250. In this example, clustered storage system 250 processes I/O requests from hosts 210 received via switched fabric 230. Storage controllers 220 are coupled for communication with storage devices 242 via switched fabric 235, which may be integral with or distinct from switched fabric 230. Storage devices 242 implement logical volumes 240. Many other configurations of hosts, storage controllers, switched fabric, and logical volumes are possible for clustered storage systems as a matter of design choice. Further, in many high reliability storage systems, all the depicted couplings, as well as the interconnect fabrics, may be duplicated for redundancy.
While clustered storage systems provide a number of performance benefits over more traditional storage systems, the speed of the storage system typically remains a bottleneck to the overall speed of a processing system that utilizes it.
In many high-performance, high-reliability storage systems, the storage controllers have numerous processes, each operable to process a corresponding aspect of received I/O requests. These processes allocate buffers for a number of purposes in performing their respective functions. One process may allocate a buffer to copy a portion of an I/O request for its use, while other processes may allocate buffers to build data structures used in processing the request. In a clustered storage environment, where a plurality of storage controllers cooperate closely in processing requests to the logical volumes they own and in transferring such ownership, still other processes may build structures used when shipping (transferring) an I/O request to another controller for processing or when communicating with other controllers regarding ownership of a logical volume.
Each process may be designed in the controller logic with a dedicated block of memory that the process manages on its own. However, such a fixed allocation does not provide flexibility where some processes require more memory at times while others require less. A fixed allocation may also waste memory: the maximum capacity that each process could ever require must be reserved to assure that the process can continue, even though the process may rarely need that maximum amount. It is therefore often preferred that buffers be allocated (and freed) for each of the various processes from a common pool of available memory. Each process may then allocate buffers as needed and release them when they are no longer required. Numerous issues arise in allocating buffers for multiple processes from a common pool in view of the varying requirements of each process over time. Deadlock must be avoided, that is, a situation in which two or more tasks cannot proceed because each is waiting to allocate more memory and no task is ready to release its memory to make more buffers available. Further, performance of the storage controller can be impacted by buffer allocation from a common pool. For example, a process may be stalled waiting for other processes to release buffers. Still further, performance may be impacted if a process awaiting buffer allocation constantly retries its allocation. Such a “polling” loop may consume valuable processing resources within the controller.
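The common-pool issues discussed above can be made concrete with a short sketch. It is an illustration under stated assumptions, not the method of the invention or of any particular controller. The class `BufferPool` and its methods are hypothetical, and Python threads stand in for controller processes. A waiter sleeps on a condition variable rather than polling, and a request for several buffers is granted all at once or not at all, so a waiting caller never holds a partial allocation while it waits.

```python
import threading


class BufferPool:
    """Hypothetical common pool of fixed-size buffers shared by many tasks."""

    def __init__(self, count: int, size: int):
        self._free = [bytearray(size) for _ in range(count)]
        self._cond = threading.Condition()

    def allocate(self, n: int, timeout=None):
        """Return n buffers together, or None if the timeout expires first."""
        with self._cond:
            # Sleep until enough buffers are free; no busy "polling" loop.
            enough = self._cond.wait_for(lambda: len(self._free) >= n, timeout)
            if not enough:
                return None
            # All-or-nothing grant: the caller never waits while holding
            # only part of its request.
            return [self._free.pop() for _ in range(n)]

    def release(self, buffers) -> None:
        """Return buffers to the pool and wake any waiting allocators."""
        with self._cond:
            self._free.extend(buffers)
            self._cond.notify_all()

    def available(self) -> int:
        with self._cond:
            return len(self._free)
```

With `pool = BufferPool(4, 512)`, a task that has taken three buffers leaves one free; a second task calling `pool.allocate(2, timeout=0.01)` receives `None` rather than spinning, and a call without a timeout would sleep until `release` is invoked. The all-or-nothing grant addresses only the hold-and-wait case within a single request; tasks that make several separate allocations could still deadlock and would need additional measures.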
Thus, it is an ongoing challenge to provide efficient buffer allocation in a storage controller that avoids deadlock and performance problems.