As is known in the art, large host computers and servers (collectively referred to herein as “host computer/servers”) require large capacity data storage systems. These large computer/servers generally include data processors, which perform many operations on data introduced to the host computer/server through peripherals including the data storage system. The results of these operations are output to peripherals, including the storage system.
One type of data storage system is a magnetic disk storage system having a bank of disk drives. The bank of disk drives and the host computer/server are coupled together through a system interface. The interface includes “front end” or host computer/server controllers (or storage processors) and “back-end” or disk controllers (or storage processors). The interface operates the storage processors in such a way that they are transparent to the host computer/server. That is, user data is stored in, and retrieved from, the bank of disk drives in such a way that the host computer/server merely thinks it is operating with its own local disk drive. One such system is described in U.S. Pat. No. 5,206,939, entitled “System and Method for Disk Mapping and Data Retrieval”, inventors Moshe Yanai, Natan Vishlitzky, Bruno Alterescu and Daniel Castel, issued Apr. 27, 1993, and assigned to the same assignee as the present invention.
As described in such U.S. Patent, the interface may also include, in addition to the host computer/server storage processors and disk storage processors, a user data semiconductor global cache memory accessible by all the storage processors. The cache memory is a semiconductor memory and is provided to rapidly store data from the host computer/server before storage in the disk drives, and, on the other hand, store data from the disk drives prior to being sent to the host computer/server. The cache memory, being a semiconductor memory, as distinguished from a magnetic memory as in the case of the disk drives, is much faster than the disk drives in reading and writing data. As described in U.S. Pat. No. 7,136,959 entitled “Data Storage System Having Crossbar Packet Switching Network”, issued Nov. 14, 2006, inventor William F. Baxter III, assigned to the same assignee as the present invention, the global cache memory may be distributed among the service processors.
Another data storage system is described in U.S. Patent Application Publication No. US 2005/0071424, entitled “Data Storage System”, inventor Baxter III, published Mar. 31, 2005, assigned to the same assignee as the present invention. In such system, front and back end directors (hereinafter referred to as storage processors) include: a message engine, a data pipe and a portion of a global cache memory. The front and back end storage processors are interconnected through a packet switching network. The packet switching network passes both user data and messages, the user data passing through the data pipe and the messages being generated and received by the message engine. Write data supplied by the host computer/server for storage in the bank of disk drives is passed to the local cache memory section of one of the second plurality of storage processor/memory boards and the storage processor on such one of the second plurality of storage processor/memory boards controls the transfer of data from such one of the memory sections to the bank of disk drives. Read data supplied by the bank of disk drives for use by the host computer/server is passed to the local cache memory section of one of the first plurality of storage processor/memory boards and the storage processor on such one of the first plurality of storage processor/memory boards controls the transfer of data from such one of the memory sections to the host computer/server. The front-end and back-end storage processors control the transfer of user data between the host computer/server and the bank of disk drives through the packet switching networks in response to messages passing between and/or among the storage processors through the packet switching networks.
As is also known in the art, it is desirable to maximize user data transfer through the interface, including maximizing packet transfer through the packet switching network.
As is also known in the art, each one of the storage processors includes a CPU and a local/remote memory interconnected to the packet switching network through a commercially available root complex, such as an INTEL root complex using a PCI-Express (PCIE) protocol. One such packet switching network operates with a Serial Rapid IO (SRIO) protocol and is sometimes referred to as an SRIO fabric. We have discovered that for certain system interfaces, greater system throughput can be achieved using an SRIO fabric. The benefits of SRIO over other packet switched protocols such as Ethernet for storage applications are that SRIO has guaranteed delivery (since every request has an associated response), supports low latency applications (since the maximum packet payload size is 256 bytes) while maintaining reasonable bandwidth (of about 1 Gbyte/sec per direction), and can be implemented in low cost, structured ASIC designs since protocol complexity is minimal.
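By way of illustration only, the 256 byte maximum packet payload noted above implies that a larger user data transfer is carried as a sequence of packets. The following is a minimal sketch of that arithmetic, not a description of any actual SRIO implementation; the names MAX_SRIO_PAYLOAD, split_into_payloads and packet_count are hypothetical, and packet headers, responses and alignment rules are ignored.

```python
# Illustrative sketch only: names and structure are assumptions, not part of
# the SRIO specification or of the system described above.

MAX_SRIO_PAYLOAD = 256  # bytes; maximum packet payload size stated in the text


def split_into_payloads(data: bytes, max_payload: int = MAX_SRIO_PAYLOAD) -> list:
    """Split a buffer into consecutive chunks of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


def packet_count(transfer_bytes: int, max_payload: int = MAX_SRIO_PAYLOAD) -> int:
    """Number of packets needed to carry transfer_bytes (ceiling division)."""
    if transfer_bytes < 0:
        raise ValueError("transfer_bytes must be non-negative")
    return -(-transfer_bytes // max_payload)
```

Under these assumptions, a 4096 byte block would be carried in 16 packets, each small enough to keep per-packet latency low.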
It should be noted that some SRIO terminology used herein may be found in the following references published by the RapidIO Trade Association:
    Rapid IO Interconnect Specification, version 1.3
    Rapid IO Interconnect Specification, Part VI: Physical Layer 1×/4× LP-Serial Specification;
some of the PCI terminology used herein may be found in the following references published by the PCI-SIG (Peripheral Component Interconnect Special Interest Group):
    PCI Express Base Specification, version 1.1; and
other terminology used herein may be found in INCITS: T10 Technical Committee on SCSI Storage Interfaces—Preliminary DIF (Block CRC) documents.
As is also known in the art, a DSA transfer is used for a CPU within a storage processor (SP) to indirectly access a local/remote memory in any SP on the packet switching network. More particularly, as used herein, a DSA transfer is “indirect” because in the present system the CPU is “detached” from the operation as soon as the DSA operation is initiated from the CPU. Once initiated, the CPU is free to perform other work (if there is work not dependent on a DSA in flight) until the DSA transfer is completed. When the DSA is completed, the DSA status and data (if applicable) are “pushed” into the local memory of the initiating, or source, SP and an interrupt is generated to the initiating CPU for completion notification. (Polling of the DSA status word in local memory is also possible for absolute lowest latency when no forward progress can be made until the DSA transfer is completed.)
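By way of illustration only, the “detached” behavior described above can be modeled in software. The following is a minimal sketch under stated assumptions, not the hardware mechanism itself: a background thread stands in for the fabric, a threading.Event stands in for the completion interrupt, and the names DsaRequest, initiate_dsa_read, wait_by_interrupt and wait_by_polling are hypothetical.

```python
# Illustrative model only: a thread stands in for the fabric hardware and an
# Event stands in for the completion interrupt. All names are hypothetical.
import threading
import time

DSA_PENDING = 0
DSA_DONE = 1


class DsaRequest:
    """Completion record kept in the initiating SP's local memory."""

    def __init__(self):
        self.status = DSA_PENDING            # DSA status word
        self.data = None                     # data pushed back, if applicable
        self.interrupt = threading.Event()   # stands in for the CPU interrupt


def initiate_dsa_read(remote_memory: dict, address: int) -> DsaRequest:
    """Start a transfer and return at once; the caller is now detached."""
    req = DsaRequest()

    def engine():
        value = remote_memory[address]  # access performed away from the CPU
        req.data = value                # push data first...
        req.status = DSA_DONE           # ...then the status word
        req.interrupt.set()             # then notify the initiating CPU

    threading.Thread(target=engine, daemon=True).start()
    return req


def wait_by_interrupt(req: DsaRequest, timeout: float = 1.0):
    """Block until the completion notification arrives."""
    if not req.interrupt.wait(timeout):
        raise TimeoutError("DSA transfer did not complete")
    return req.data


def wait_by_polling(req: DsaRequest, timeout: float = 1.0):
    """Spin on the status word, for when no other progress can be made."""
    deadline = time.monotonic() + timeout
    while req.status != DSA_DONE:
        if time.monotonic() > deadline:
            raise TimeoutError("DSA transfer did not complete")
    return req.data
```

In this model the caller may perform unrelated work between initiate_dsa_read and either wait function, which mirrors the CPU being free until the status and data are pushed into local memory.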
However, existing SRIO fabrics do not support DSA or atomic transfers with commercially available root complexes. More particularly, the PCI-Express (PCIE) standard does not directly support atomic operations, and the RIO standard's support for atomic operations is limited.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.