1. Field of the Invention
The present invention relates generally to storage area networks, and in particular to a system for synchronizing task processing.
2. Description of the Related Art
The management of information is becoming an increasingly daunting task in today's environment of data intensive industries and applications. More particularly, the management of raw data storage is becoming more cumbersome and difficult as more companies and individuals are faced with larger and larger amounts of data that must be effectively, efficiently, and reliably maintained. Entities continue to face the necessity of adding more storage, servicing more users, and providing access to more data for larger numbers of users.
The concept of storage area networks or SANs has gained popularity in recent years to meet these increasing demands. Although various definitions of a SAN exist, a SAN can generally be considered a network whose primary purpose is the transfer of data between computer systems and storage elements and among storage elements. A SAN can form an essentially independent network that does not have the same bandwidth limitations as many of its direct-connect counterparts including storage devices connected directly to servers (e.g., with a SCSI connection) and storage devices added directly to a local area network (LAN) using traditional Ethernet interfaces, for example.
In a SAN environment, targets, which can include storage devices (e.g., tape drives and RAID arrays) and other devices capable of storing data, and initiators, which can include servers, personal computing devices, and other devices capable of providing write commands and requests, are generally interconnected via various switches and/or appliances. The connections to the switches and appliances are usually Fibre Channel. This structure generally allows for any initiator on the SAN to communicate with any target and vice versa. It also provides alternative paths from initiator to target. In other words, if a particular initiator is slow or completely unavailable, another initiator on the SAN can provide access to the target. A SAN also makes it possible to mirror data, making multiple copies available and thus creating more reliability in the availability of data. When more storage is needed, additional storage devices can be added to the SAN without the need to be connected to a specific initiator; rather, the new devices can simply be added to the storage network and can be accessed from any point.
Some SANs utilize appliances to perform storage management for the SAN. A typical appliance may receive and store data within the appliance, then, with an internal processor for example, analyze and operate on the data in order to forward the data to the appropriate target(s). Such store-and-forward processing can slow down data access, including the times for reading data from and writing data to the storage device(s).
An example of a SAN is shown in the system 100 illustrated in the functional block diagram of FIG. 1. As shown, there are one or more servers 102. Three servers 102 are shown for exemplary purposes only. Servers 102 are connected through an Ethernet connection to a LAN 106 and/or to a router 108 and then to a WAN 110, such as the Internet. In addition, each server 102 is connected through a Fibre Channel connection to each of a plurality of Fibre Channel switches 112, sometimes referred to as the “fabric” of the SAN. Two switches 112 are shown for exemplary purposes only. Each switch 112 is in turn connected to each of a plurality of SAN appliances 114. Two appliances 114 are shown for exemplary purposes only. Each appliance is also coupled to each of a plurality of storage devices 116, such as tape drives, optical drives, or RAID arrays. In addition, each switch 112 and appliance 114 is coupled to a gateway 118, which in turn is coupled to router 108, which ultimately connects to a Wide Area Network (WAN) 110, such as the Internet. FIG. 1 shows one example of a possible configuration of a SAN 119, which includes switches 112, appliances 114, storage devices 116, and gateways 118. Still other configurations are possible. For instance, one appliance may be connected to fewer than all the switches.
SANs, typically through switches and/or appliances, perform virtualization functions to allocate space of one or more physical targets to a particular user with the physical space remaining unknown to the user. For example, a company may utilize a SAN to provide data storage that employees access for data storage and retrieval. An engineering department, for example, may have storage allocated as “engineering storage space.” The employees may see and interact with the virtual space as they would see or interact with a physical storage device such as an attached hard disk drive. Nevertheless, the space may actually be divided over multiple physical storage devices and even be fragmented within single storage devices. A switch or appliance can receive a request for a virtual space and block number(s) and determine the device(s) and portions thereof that physically correlate to the virtual space requested in order to direct the data accordingly.
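The lookup described above, from a virtual space and block number to a physical device and block, can be sketched as follows. This is a minimal illustration only, not the mapping used by any particular switch or appliance; the extent table, the device names ("raid-a", "raid-b"), and the function name resolve are all assumptions introduced for the example.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Tuple


class Extent(NamedTuple):
    """One contiguous run of virtual blocks backed by one physical device."""
    virtual_start: int   # first virtual block covered by this extent
    length: int          # number of blocks in the extent
    device: str          # hypothetical physical device name
    physical_start: int  # first physical block on that device


def resolve(extents: List[Extent], virtual_block: int) -> Tuple[str, int]:
    """Map a virtual block number to (device, physical block).

    Assumes `extents` is sorted by virtual_start and does not overlap.
    """
    starts = [e.virtual_start for e in extents]
    i = bisect_right(starts, virtual_block) - 1
    if i < 0:
        raise LookupError(f"virtual block {virtual_block} is not mapped")
    extent = extents[i]
    offset = virtual_block - extent.virtual_start
    if offset >= extent.length:
        raise LookupError(f"virtual block {virtual_block} is not mapped")
    return extent.device, extent.physical_start + offset


# Hypothetical "engineering storage space": spread over two devices and
# fragmented within one of them, as the text describes.
ENGINEERING = [
    Extent(virtual_start=0, length=100, device="raid-a", physical_start=5000),
    Extent(virtual_start=100, length=50, device="raid-b", physical_start=0),
    Extent(virtual_start=150, length=100, device="raid-a", physical_start=200),
]

device, block = resolve(ENGINEERING, 120)
```

Here virtual block 120 falls in the second extent, so the request would be directed to the hypothetical device "raid-b" at physical block 20, while the user sees only a single contiguous space.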
In general, SANs are formed using a single protocol to interconnect the devices. Although Fibre Channel is the most commonly used, Ethernet connections have also been used. Nonetheless, if both protocols are desired to be used, some kind of transition between the two protocols must occur. In such instances, a Fibre Channel SAN 119 is typically coupled to an Ethernet SAN 122 via a bridge 121. To transition from one protocol to the other, a packet is received by the bridge and stored in memory. Once the packet is stored in a memory, a processor operates on the packet to remove the headers of one protocol and build the headers of the other protocol, thereby constructing an entirely new packet.
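The store-and-rebuild transition described above can be sketched as follows. The two header layouts are toy formats invented for this example and are not the actual Fibre Channel or Ethernet frame formats; the address table and the function name bridge are likewise assumptions.

```python
import struct

# Hypothetical header layouts, for illustration only.
PROTO_A = struct.Struct("!HHI")    # source id, destination id, sequence number
PROTO_B = struct.Struct("!6s6sH")  # destination address, source address, type

# Hypothetical table translating protocol-A ids to protocol-B addresses.
ADDRESS_MAP = {
    1: b"\x02\x00\x00\x00\x00\x01",
    2: b"\x02\x00\x00\x00\x00\x02",
}

PLACEHOLDER_TYPE = 0x88B5  # arbitrary type value chosen for the sketch


def bridge(packet: bytes) -> bytes:
    """Rebuild a stored protocol-A packet as a protocol-B packet.

    The whole packet is held in memory, the old header is removed,
    and a new header is built in front of the unchanged payload.
    """
    src, dst, _sequence = PROTO_A.unpack_from(packet)
    payload = packet[PROTO_A.size:]
    header = PROTO_B.pack(ADDRESS_MAP[dst], ADDRESS_MAP[src], PLACEHOLDER_TYPE)
    return header + payload


inbound = PROTO_A.pack(1, 2, 7) + b"data"
outbound = bridge(inbound)
```

Because the entire packet must be stored before the processor can rebuild it, each packet crossing the bridge incurs the buffering delay that the text attributes to this approach.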
While appliances can perform switching operations, switches are often used to connect initiators with appliances, given the large number of initiators and small number of ports included in many appliances. In more current SAN implementations, switches have replaced certain functionality previously performed by appliances such that appliances are not necessary and can be eliminated from the systems.
More recent storage area network switches are capable of routing data between initiators and targets without buffering the data as required by earlier appliances used in SANs. For example, some storage switches can route data packets without introducing more latency to the packets than would be introduced by a typical network switch. Such unbuffered data transfer between initiators and targets must be handled reliably and efficiently by the switch performing the interconnection. An example of a storage switch can be found in U.S. Pat. No. 7,864,758, entitled VIRTUALIZATION IN A STORAGE SYSTEM, filed Jan. 18, 2002, previously incorporated herein by reference.
As disclosed in U.S. Pat. No. 7,864,758, a storage switch may include one or more line cards for establishing connections to the servers and storage devices. Each line card may include Packet Processing Units (PPUs) for performing virtualization and protocol translation on the fly (i.e., no buffering). Each line card may include an ingress PPU for receiving data packets into the switch, and an egress PPU for sending data packets out from the switch.
It is essential that the ingress PPU and the egress PPU remain in synchronization with each other. For example, when the initiator (e.g., the server) sends a task, such as a request to write data to the storage device, this request is received by the ingress PPU, which in turn forwards the request to a line card traffic manager. If the storage device is ready to receive the request, the storage device sends a transfer ready response to the egress PPU via the line card traffic manager, and the egress PPU in turn forwards the response back to the initiator to send the data. It is important that the transfer ready response generated by the egress PPU is for the specific request generated by the ingress PPU. Otherwise the request will not complete successfully.
However, it may happen that a task is forwarded by the ingress PPU and, for a variety of reasons, the transfer ready response for that task is not received back from the egress PPU. For example, the storage device may be unavailable. In such an instance, the ingress PPU will time out while waiting for the response, or the system may become deadlocked waiting for it. Alternatively, if and when the egress PPU does generate the transfer ready response, the ingress PPU may have received a new request, and there is no guarantee that the response from the egress PPU is synchronized to the request from the ingress PPU. There is no reliable messaging mechanism between the ingress PPU and the egress PPU. As a result, the PPUs must be capable of recovering from any request or response loss, whether it is caused by an error internal or external to the storage switch.
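The hazard described above, a response that arrives late or never and may be mistaken for the answer to a newer request, can be illustrated with a small bookkeeping sketch. It is not the mechanism of the present invention or of the referenced storage switch; the class PendingTasks, its method names, the task tags, and the explicit `now` timestamps are all assumptions made for the example.

```python
from typing import Dict, Hashable, List


class PendingTasks:
    """Tracks forwarded tasks so each response can be checked against its request."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self._deadlines: Dict[Hashable, float] = {}

    def forward(self, tag: Hashable, now: float) -> None:
        """Record that the task identified by `tag` was forwarded at time `now`."""
        self._deadlines[tag] = now + self.timeout

    def on_transfer_ready(self, tag: Hashable, now: float) -> str:
        """Classify a transfer ready response carrying `tag`."""
        deadline = self._deadlines.pop(tag, None)
        if deadline is None:
            return "unmatched"  # no outstanding request with this tag
        if now > deadline:
            return "stale"      # request had already timed out
        return "matched"

    def expire(self, now: float) -> List[Hashable]:
        """Drop and return the tags whose responses never arrived in time."""
        expired = [tag for tag, deadline in self._deadlines.items() if now > deadline]
        for tag in expired:
            del self._deadlines[tag]
        return expired


pending = PendingTasks(timeout=5.0)
pending.forward(tag=1, now=0.0)
first = pending.on_transfer_ready(tag=1, now=2.0)   # response arrives in time
pending.forward(tag=2, now=0.0)
lost = pending.expire(now=6.0)                      # response never arrived
late = pending.on_transfer_ready(tag=2, now=7.0)    # arrives after expiry
```

In this sketch the response for tag 2 arrives after its request has been expired, so it is reported as unmatched rather than being applied to whatever request happens to be current.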
Both the ingress PPUs and egress PPUs have memories, such as, for example, a static random access memory (SRAM). However, the ingress and egress PPU memories cannot be shared, as it is imperative for performance that the ingress and egress PPUs execute separately and independently of each other. Nor is it practical to use a hardware interlock mechanism in the storage switch to maintain synchronization, as this would result in performance loss and increased logic space. While it may further be possible to provide a buffered implementation to provide reliable messaging, this too would adversely affect system performance.