1. Technical Field
The present disclosure relates to storage systems.
2. Related Art
Information in storage infrastructures today is stored in various storage devices that are made accessible to clients via computing systems. A typical storage infrastructure may include a storage server and a storage subsystem having an expander device (for example, a Serial Attached SCSI (SAS) Expander) and a set of mass storage devices, such as magnetic or optical storage based disks or tapes. The mass storage devices may comply with various industry protocols and standards, for example, the SAS and Serial Advanced Technology Attachment (SATA) standards.
The storage server is a special-purpose processing system that is used to store data on behalf of one or more clients. The storage server stores and manages shared files at the mass storage devices.
The expander device (for example, a SAS Expander) is typically used for facilitating communication between a plurality of mass storage devices within a storage subsystem. The expander device in one storage subsystem may be operationally coupled to expander devices in other storage subsystems. The expander device typically includes one or more ports for communicating with other devices within the storage infrastructure. The other devices also include one or more ports for communicating within the storage infrastructure.
The term “port” (which may also be referred to as “PHY”) as used herein means a protocol layer that uses a transmission medium for electronic communication within the storage infrastructure. A port typically includes a transceiver that electrically interfaces with a physical link and/or storage device.
For executing input/output (“I/O”) operations (for example, reading and writing data to and from the mass storage devices), a storage server typically uses a host bus adapter (“HBA”, which may also be referred to as an “adapter”) to communicate with the storage infrastructure devices (for example, the expander device and the mass storage devices). To effectively communicate with the storage infrastructure devices, the adapter performs a discovery operation to discover the various devices operating within the storage infrastructure at any given time. This operation may be referred to as “topology discovery”. A topology discovery operation may be triggered when a PHY changes its state and sends a PHY state change notification. A change in PHY state means that the PHY has transitioned between being “ready” and “not ready” to communicate. A PHY may change its state for various reasons, for example, due to problems with cable connections or storage device connections, or due to storage device errors within a storage subsystem.
When a PHY changes its state, a notification is typically sent out to other devices, for example, to the adapter and the expander device. After receiving a PHY state change notification, the adapter performs the topology discovery operation. A certain number of PHY state change notifications are expected during normal storage infrastructure operations. However, if a PHY starts repeatedly sending PHY state change notifications, then instead of efficiently executing I/O operations, the adapter repeatedly performs discovery operations in response to the notifications. As a result, I/O operation execution is negatively impacted, and may even be lost completely, because the adapter resources are used for discovery operations rather than solely for executing I/O operations.
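The effect described above can be illustrated with a minimal simulation sketch. It is not taken from any actual adapter implementation: the function name `simulate`, the tick-based time model, the rule that one I/O operation completes per idle tick, and the `discovery_cost` value are all assumptions chosen only to show how repeated PHY state change notifications can crowd out I/O operations.

```python
def simulate(total_ticks, notification_ticks, discovery_cost=3):
    """Toy model of an adapter (hypothetical, for illustration only).

    Each tick the adapter either works on topology discovery or completes
    one I/O operation. Every PHY state change notification (re)starts a
    discovery that occupies the adapter for `discovery_cost` ticks.
    Returns (io_completed, discoveries_started).
    """
    notifications = set(notification_ticks)
    discovery_remaining = 0  # ticks of discovery work still outstanding
    io_completed = 0
    discoveries = 0

    for tick in range(total_ticks):
        if tick in notifications:
            # A notification always triggers a fresh topology discovery.
            discoveries += 1
            discovery_remaining = discovery_cost
        if discovery_remaining > 0:
            # Adapter resources are spent on discovery, not on I/O.
            discovery_remaining -= 1
        else:
            io_completed += 1

    return io_completed, discoveries


# Normal operation: a single notification during 20 ticks.
print(simulate(20, [5]))               # (17, 1)

# A PHY that changes state every 2 ticks: discovery never finishes.
print(simulate(20, range(0, 20, 2)))   # (0, 10)
```

Under these assumed parameters, a single notification costs only a few ticks of I/O time, whereas a PHY that changes state more often than a discovery can complete leaves no ticks for I/O at all.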
In conventional storage infrastructures in general, and in storage subsystems in particular, a PHY state change notification is handled like an ordinary event whose purpose is to trigger a topology discovery operation. PHY state change notifications are typically not analyzed as an error condition or as a potential error indicator that may trigger another action besides the standard topology discovery operation.
Users today expect to efficiently perform operations to access information stored at storage devices with minimal disruption or loss of service. Therefore, there is a need for efficiently managing PHY state change notifications in storage infrastructures so as to reduce the chances of loss of service and disruption in executing I/O operations.