The present invention relates to storage devices, and more particularly to adapters for interfacing a storage device to a host for high speed, high reliability and high availability data storage.
Various interfaces are known for interconnecting host computer systems to storage systems, such as disk drive mass storage systems. Particular interfaces involve respective particular data transfer protocols and system interconnections. The IBM Enterprise System Connection Architecture (ESCON) is a flexible interface or interconnection environment that is used to move data from a host source to storage and back. ESCON, as it is known in the art, combines technology and architecture to include fiber optic cabling for transmission and reception of data. ESCON implements dynamic connectivity through switched point to point topology and data flow interconnectivity with other networks. The ESCON I/O architecture exploits the fiber optic technology implementation and the concept of dynamic connectivity in providing a flexible, extensible interface.
The ESCON architecture has become a widely accepted standard for data communications in large scale computer environments, replacing the more traditional IBM xe2x80x9cBus and Tagxe2x80x9d protocol. ESCON employs serially encoded data transmission techniques in place of parallel data protocols.
The ESCON architecture provides inherent advantages such as: information transfer using fiber optic technology; higher rates of data transfer (17 MBytes/sec); extended distances (3 km typical with extension to 60 km); dynamic connectivity through switched point-to-point data flow; and interconnectivity with local and wide area networks
Application of ESCON as the means for data communications can, in addition to the asserted enhancements in data throughput rates and distance, provide other meaningful benefits such as: greater connection flexibility of equipment; reduced floor loading due to significant cable weight and size reductions; customer configuration expansion or reconfiguration with minimal or no disruption; increased data integrity and security; and reduced cost of ownership through more effective utilization of equipment.
The ESCON architecture has been adopted by many other manufacturers of computer equipment as a basic input/output protocol and it has been accepted as an ANSI Standard (xe2x80x9cSBCONxe2x80x9d). The technical details of the IBM ESCON interface are described, among other places, in various IBM publications including INTRODUCING ENTERPRISE SYSTEMS CONNECTION, IBM 3990 ESCON FUNCTION, INSTALLATION AND MIGRATION, IBM 3990 STORAGE CONTROL ESCON FEATURES PRESENTATION GUIDE, ENTERPRISE SYSTEMS ARCHITECTURE/390 ESCON I/O INTERFACE AND ENTERPRISE SYSTEMS ARCHITECTURE, which are incorporated herein by reference.
The various devices interconnected to a host system with ESCON I/O include storage systems known as xe2x80x9cIntegrated Cached Disk Arrays,xe2x80x9d (xe2x80x9cICDAsxe2x80x9d), which are typically an array of small inexpensive disk drives integrated into a single chassis. High speed caches are implemented between the host and disks in ICDAs to yield improved performance. One family of known ICDA products, known as SYMMETRIX produced by EMC Corporation, Hopkinton, Mass., provides a high reliability array of drives, and offers great flexibility in terms of performance enhancements such as: mirroring; greater data availability; greater data transfer rates over distributed buses; and various levels of redundancy implemented in systems referred to as xe2x80x9cRAID systemsxe2x80x9d (xe2x80x9cRedundant Arrays of Inexpensive Disksxe2x80x9d).
The EMC2 Symmetrix architecture, generally illustrated in FIG. 1, integrates a high speed cache or global memory between a disk array and a host computer or CPU. The functional elements generally required to integrate the cache include a host-to-cache interface (which in one implementation is an IBM standard ESCON interface referred to as a host xe2x80x9cESCON Adapterxe2x80x9dxe2x80x94EA), and a cache-to-disk drives interface (which may be a Small Computer Systems Interface, xe2x80x9cSCSIxe2x80x9d, referred to as a xe2x80x9cDisk Adapterxe2x80x9dxe2x80x94DA). The EA and DA interface boards are generically referred to as xe2x80x9cDirectorsxe2x80x9d. The Symmetrix architecture operates under a xe2x80x9ccache allxe2x80x99xe2x80x9d policy, meaning that all transfers, i.e. from the host to the drives or from the drives to the host, go through cache. The principal function of the Directors is the movement of data between the host and Global Memory (cache) or between Global Memory and the Disk Drives.
The Global Memory Bus (GMB), between the EA and cache and between the DA and cache in Symmetrix, actually consists of two portions or identical buses designated xe2x80x9cAxe2x80x9d and xe2x80x9cBxe2x80x9d. The use of two buses improves performance and eliminates a possible single point of failure. Each bus has independent arbitration and consists of a 32 bit address bus plus 1 parity, a 64 bit data bus with 8 bits of Error Correction Code (ECC) check bits and a number of control lines with parity. The smallest data transfer that may take place over the GMB is 64 bits during each access (however, byte, word and longword operations are performed within a Director). The SYMMETRIX family of ICDAs are described in detail in the Symmetrix Product Manuals (for Models 5500, 52XX, 5100, 3500, 32XX and 3100) which are incorporated herein by reference.
The ESCON adapter interface(s) and disk adapter interface(s), i.e. the bus(es) between the host and EAs and between the disk array and DAs respectively, are 8 bit and in some cases 16 bit interfaces in Symmetrix. Thus the bytes received from a host have to be assembled into a 64-bit memory word for transfer to Global Memory, i.e. cache. Similarly, 64-bit memory words from Global Memory have to be disassembled into bytes for transmission over the interface(s). Assembly/disassembly of data to/from Global Memory is carried out by plural gate arrays located on each Director.
The Symmetrix family of ICDAs is designed around a pipelined architecture. A pipeline or pipe in the system is a registered path along which data is clocked to move it from one location to another.
The Directors, generally, are designed around a common pipelined architecture moving data along the xe2x80x9cpipexe2x80x9d or pipeline under microprocessor control. It is the pipelines that move the data when the system is operating, not the controlling microprocessors. The microprocessors set-up the pipes to perform a transfer and monitor the pipelines for errors during the transfer. The Directors in known Symmetrix systems incorporate a dual control processor architecture including a first processor referred to as xe2x80x9cXxe2x80x9d and a second processor referred to as xe2x80x9cYxe2x80x9d. The dual processor architecture is configured to share substantial resources in order to keep hardware requirements to a minimum. Each control processor in a xe2x80x9cfront endxe2x80x9d Director, i.e. transferring data to/from Global Memory, is typically responsible for two pipelines designated xe2x80x9cAxe2x80x9d and xe2x80x9cBxe2x80x9d and respective Direct Multiple Access (DMA) and Direct Single Access (DSA) pipelines, for Global Memory access.
Known Symmetrix systems can be configured with ESCON adapter circuitry for communication with hosts having ESCON I/O, as disclosed in U.S. patent application Ser. No. 08/753,673 which is incorporated herein by reference. FIG. 2 illustrates a known ESCON front end Adapter or Director, configured for transferring data between a host ESCON I/O (not shown) according to the IBM Bus ESCON standard known in the art and the ICDA incorporating the Global Memory. Data in accordance with the ESCON protocol is received on the ESCON Director by ESCON Receiver/Transmitter Gate Arrays (Rx/Tx GA) which receive and transmit ESCON data between the host ESCON I/O and the ESCON Director pipes in accord with the ESCON I/O protocol. ESCON data is transmitted in two physical pipes that are configured to operate as four pipes, two on each side: an A and B pipe (XA and XB for transmit and receive, respectively) on the X side controlled by an X control processor, and an A and B pipe (YA and YB for transmit and receive, respectively) on the Y side controlled by a Y control processor. Parallel ESCON data is passed in the pipes on the ESCON front end Director on a bidirectional bus connected to respective ESCON Receiver/Transmitter Gate Arrays (Rx/Tx GA), which in turn assemble/disassemble 64 bit memory words. Error detection and correction is performed in the ESCON pipe using a pass through EDAC device. Dual Port Ram (DPR) is implemented in the pipe for buffering data transferred between the ESCON host (not shown) and Global Memory.
Each processor on the ESCON Director has its-own Memory Data Register (MDR) to support DMA/DSA activity. The MDR provides a 72 bit wide register set, 64 bit data and 8 bits parity, comprised of upper and lower words that can be independently read or written by the respective microprocessor. Similarly, the MDR performs the assembly and disassembly of 64 bit Global Memory words.
The two microprocessors on the known ESCON Director perform functions associated with managing data flow to and from Global Memory, and additionally are accessible as individual points of contact for a user to configure memory and/or test and exercise the memory and pipelines (e.g. for diagnostic purposes). Such an implementation can disadvantageously interfere with the data flow management and adds complexity to the system. Complexity is inherent in the configuration(s) known in the art, in that the individual pipeline or line processor""s activities, beyond data flow management, must be coordinated with other line processors"" activities. Furthermore, involving line processors in other than data flow management leads to bottlenecks on the backplane bus that negatively affect system performance. For example, inter-processor communication, i.e. communication between line processors, interferes with the system""s data flow capabilities.
Known ICDA implementations, such as known Symmetrix systems, are designed according to a two level architecture. That is, the Director interface is a first level interface that works in conjunction with a second level adapter board that is directly connected to the host ESCON I/O. The ESCON Receiver/Transmitter Gate Arrays (Rx/Tx GA) on the Director transmit data to or receive data from the second level ESCON adapter board that effects data conversion using devices known in the art. The second level adapter board converts parallel data to serial ESCON data using a Cypress CY7B923 device (for transmission to the host), or converts serial ESCON data to parallel with a Cypress CY7B933 (for receipt by the ESCON Director). A processor on the second level adapter board handles protocol operations and basically translates ESCON information into a generic interface protocol, or interchange format. The adapter board includes memory associated with the adapter processor, and all of the data from the ESCON protocol engines is stored temporarily in this memory. The processor on the adapter then has to figure out by looking through the stored data where header information is and where the data is. The adapter processor then writes in the adapter memory information about where the data starts and ends. In addition, the processor interprets commands from the host and writes into the adapter memory what the command is and what the ESCON command means in a generic fashion.
The Director card processor then finds this information. The processor on the Director(s) polls through the adapter memory space to find directives from the adapter. For example, an ESCON Director polling the adapter memory could find a write operation along with an indication of where the data to be written is located. The converted ESCON information found can then be handled by the processor(s) on the Director(s), which take the data and actually do transfers to/from the Global Memory.
Such a two level architecture results in significant loss of performance, in part because of the polling for the data transfer information. With the two levels, the adapter card at one level has to figure out what the command is, where the data is, translate it to the interchange format and store all this in the adapter memory. At the second level, the Director polls and retrieves the interchange information. There is very little pipelining that can be effected in such an implementation. Inefficiencies result from the architecture wherein operations are completed at one level and handed off to another level. Parallelism is not exploited.
Similarly, known ICDA implementations, and other arrays of inexpensive disks that transfer data according to the known ESCON protocol, receive user data for disk storage in the context of a frame that includes header information and frame information. Known integrated cache systems typically store the data in the cache memory along with the header and frame information. Such an implementation is not an efficient use of the cache memory capacity. Disadvantageously, complex software has to be configured to effectively track the location of the data in memory among ESCON header/frame information. Movement of cached data, i.e. going to or from the disk array, in known systems requires that the data be discriminated from extraneous cached protocol information which negatively affects transmission speed and efficiency.
Additionally, storage of user data among header/frame information increases the chances of conflicts arising between line processors accessing the cache to retrieve user data and other line processors perusing the cache to ascertain pertinent header/frame information. Complex pointer management is required to maintain coherency between multiple processors accessing cache storing data and header/frame information.
The present invention provides an ESCON interface for a high performance integrated cache storage device implemented in an architecture having a high degree of parallelism and a single level of protocol conversion.
According to the invention, the ESCON interface is implemented as an interface between a host ESCON I/O and global memory and has an architecture that includes a plurality of pipelines each of which is controlled by a respective line processor. In one implementation, two pipelines are controlled by one processor and the interface is flexibly configured, under software control, to implement four or eight pipelines. Each pipeline receives/transmits ESCON I/O information over a respective fiber optic transceiver. An onboard ESCON protocol conversion device or protocol engine in each pipe is configured with the capability of distinguishing data which is customer, or user, data that is to be stored on a disk or is read from disk versus the headers and information that defines control frames in which the data is presented. Transmit and receive frame dual port rams (xe2x80x9cxmit frame DPRxe2x80x9d and xe2x80x9creceive frame DPRxe2x80x9d, respectively), for storing transmitted frame and received frame information, are connected to the protocol engines which strip the frame/header information from the user data. The control information is put into a format accessible via ring pointers and into a structure, i.e. the DPR, that is easily manipulated using the onboard line processor. The user data that is to be stored in Global Memory (xe2x80x9cGMxe2x80x9d) separated from the control information that comes from the host computer is stored temporarily in First In First Out registers (FIFOs). Thus, the architecture according to the present invention implements hardware that effects discrimination between header/frame information and user data, and effects stripping and storage of header/frame information apart from user data processed for storage in cache.
In further accord with the invention, user data processed for storage in cache, which is effectively concatenated by stripping the frame/header information and storing the separated user data in the FIFOs, is subjected to enhanced error detection. The protocol engine is configured to compute Cyclic Redundancy Check (CRC) error detection information on the separated (concatenated) user data after stripping the frame/header information. Thus although any CRC data transferred from the host with the data frame is stripped, CRC information is computed for the separated data at a point close to the physical interface delivering the data to the ESCON interface according to the invention. The integrity of the data can therefore be tested again as the data moves through the pipeline(s).
Additionally, data integrity is further ensured by checking the integrity of the simplified pointer management implemented upon entering the user data in the FIFOs. The order, and thus integrity, of data moving through the FIFOs is checked by an enhanced parity mechanism. The user data is stored as bytes in the FIFOs which are nine bits wide to facilitate a parity bit, as known in the art. This parity check mechanism is enhanced by a pseudo-random number generated in a linear feedback shift register (LFSR) in the protocol engine field programmable gate array (FPGA). The pseudo-random number is used to determine whether the parity should be odd or even for that byte when the data bytes exit the FIFOs.
An assembler/disassembler mechanism in each pipeline according to the invention receives data from FIFOs (on a write operation to GM), and transfers data to FIFOs (on a read operation from GM). A substantially identical enhanced parity mechanism as implemented in the protocol engine is implemented in the assembler/disassembler to generate a pseudo-random number for each data byte. The enhanced parity accompanying data transferred from the FIFO (on a write to GM) is exclusive-ORed with LFSR data generated by enhanced parity circuitry in the assembler/disassembler which effectively reproduces the original parity for the data. If the pointers in the FIFOs become disjoined so that data is essentially being read out-of-sequence from the FIFOs, the condition will be detected in a significant number of the cases as it shows up as a parity error. Thus addressing errors in the FIFOs will manifest themselves as parity errors if the data is stored in the FIFOs with opposite parity than what is detected at the assembler/disassembler (on a write to GM), or vice versa (on a read from GM). Data transferred from the assembler/disassembler to the FIFO (on a read from GM) is processed using the enhanced parity mechanisms in substantially the same manner.
The assembler/disassembler mechanism according to the invention is more feature rich than known assembler/disassemblers such as described in the referenced U.S. patent application. The present assembler/disassembler incorporates registers that provide a pathway for the respective line processor to get access into the data string, and allow the processor to access the GM, effectively replacing the known MDR pipeline.
The assembler/disassembler also contains a counter mechanism that counts/controls the amount of data that is read into and out of the FIFOs. When the data transfer starts, the line processor can set how much data will be transferred to memory and from memory in and out of the FIFOs, which allows the entire data transfer to effectively run independent of the line processor. When the transfer count is exhausted/reached, the pipeline hardware can flush the data to memory and initiate end of transfer checks. The data can be tagged with the additional error detection and information, i.e. control status information including CRC information can be updated and/or appended to the data progressing through the pipeline. This effects an additional level of error detection by ensuring at the assembly/disassembly part of the transfer that all of the expected data was transferred into or out of the FIFO (in agreement with a count in the Middle Machine described hereinafter). Other mechanisms in the pipe, e.g. the Middle Machine, can then be most expeditiously informed whether a complete transfer was effected.
The assembler/disassembler according to the invention is configured to assemble or accumulate burst data, as opposed to moving one piece of data at a time, for transfer on the data bus en route to the GM. Thus when a threshold quantity of data is accumulated for an efficient transfer (on a write to GM), the assembler/disassembler passes the burst of data along the pipeline. Similarly, a burst transfer is effected on a read from GM, when a threshold level of space is available in the assembler/disassembler for receipt of data for an efficient transfer.
A buffer dual port ram (DPR) is configured to receive data for buffering read operations from and write operations to the GM. Data is transferred between the assembler/disassembler and the buffer DPR efficiently with a maximum burst size of sixteen, 64 bit memory words. The DPR is used to buffer and speed-isolate the different transfer rates of the GM and backplane to/from the data bus and various data paths on the ESCON interface circuit board. Data transfers between the assembler/disassembler and the buffer DPR pass through Error Detection And Correction circuitry (EDAC) implementing a modified hamming algorithm as known in the art.
According to the invention, a plurality of state machines arranged into three functional units, an Upper Machine, Middle Machine and a Lower Machine facilitate movement of user data between the buffer DPR and the Global Memory (GM).
The Middle Machine controls all data movement to and from the GM. Although not directly in the data path, it is responsible for coordinating control between the various elements that comprise the data transfer channels. The Middle Machine is interconnected to the Upper Machine and Lower Machine, and to various global memory and line processor interface, address and control elements.
The Middle Machine provides control and coordination between the Upper and Lower sides of the buffer DPR. The Lower Machine connects to the assembler/disassembler mechanism of each pipe and controls the EDAC. The Upper Machine connects to the backplane, which in turn connects to Global Memory. The actual data transfers between the buffer DPR and GM are controlled by the Upper Machine, and transfers between the buffer DPR and the assembler/disassembler are controlled by the Lower Machine.
On the Lower side, the Lower Machine handles the handshake protocol between the assembler/disassembler, the EDAC and the DPR. When an assembler/disassembler needs to transfer data, it sends a request to the Middle Machine. The Middle Machine arbitrates among the plurality of Lower side requesters, and begins a Lower grant cycle. The Middle Machine provides the number of the pipe that won the arbitration (the xe2x80x9cwon pipe numberxe2x80x9d), along with a Dual Port RAM pointer to the Lower Machine. When the Lower Machine accepts this data, it sends an acknowledge back to the assembler/disassembler which won the arbitration, and begins a data transfer between the assembler/disassembler and the DPR, through the EDAC. Upon completion of the transfer, status and ending pointer information are returned to the Middle Machine.
Similarly, the Upper Machine implements the backplane protocol to accomplish data transfers between the Global Memory connected to the backplane and the buffer DPR on the ESCON interface according to the invention. The Middle Machine determines a pipe that needs servicing. When the Middle Machine finds a candidate it provides the won pipe number along with a DPR pointer to the Upper Machine. The Upper Machine requests backplane access and provides the address, command and transfer length to the backplane. When the data transfer phase begins, the Upper Machine manages the DPR address and control signals. Upon completion of the transfer, status and ending pointer information are returned to the Middle Machine Status Register.
The Middle Machine maintains a transfer counter, which is used to pace a transfer and to calculate appropriate backplane burst sizes. The Middle Machine is further responsible for error management. When an error occurs during a transfer (either Upper or Lower), the Middle Machine will collect status from the Upper and Lower machines, appropriately post the error condition in the Status Register, and disable the failing transfer.
The architecture according to the invention further incorporates substantial shared resources including a service processor that provides, among other things, configuration and maintenance services. The service processor facilitates messaging between line processors and provides a single point of contact for a user interfacing with the line processor(s). There is also shared EPROM or FLASH memory from which the service processor and line processors get their initialization or boot program. Shared non-volatile static RAM (SRAM) is provided to store the ESCON interface board configuration data and trace logs. A shared time of day chip records the actual time and date of the real world that is shared between all the processors.
The shared resources include shared Dynamic RAM (DRAM) and Static RAM (SRAM)(xe2x80x9cshared memoryxe2x80x9d). The DRAM can be used to store code, such as diagnostic code, that is not performance critical. When an error is received in a pipeline diagnostics may be run by the processor in that line. Since the diagnostic code is the same to all processors on the board there is no reason to keep multiple copies of it, so it is kept in the shared memory. The shared memory may also be used to store portions of code that is too large to store locally with the respective line processor.
The shared memory SRAM is further used to provide the ability for the line processors and service processor to communicate via a messaging system. The shared memory includes mailboxes that are used to communicate between the line processors and the service processor. The service processor has the capability of issuing a system management interrupt to any or all of the line processors. This interrupt is an indication to the line processor(s) that it should go out to the shared memory and read its respective mailbox. In operation, the service processor can deliver a message, i.e. command, to a line processor""s mailbox, for example to tell a line processor to go off-line or on-line. The service processor will write the command into the mailbox and then assert the system management interrupt on the appropriate line processor that it wants to read the mailbox. The line processor receiving the interrupt will take the interrupt vector, read the mailbox, interpret the command and deliver the appropriate response to the mailbox. Each line processor has a system management interrupt line going back to the service processor. Once the line processor has put its response or its request into a mailbox, it will then assert its system management interrupt line to the service processor. The service processor is central to the interrupt/messaging scheme in that it can interrupt any line processor and any line processor can interrupt the service processor.
The shared resources including the service processor are used at start-up to boot the ESCON interface, and ultimately control the functional of the line processors. The service processor also provides system wide services. When there is a system wide interrupt it is delivered to the service processor. The service processor decides what to do with it. It decides whether it needs to interrupt the data transfer that is in progress. While line processors are responsible for doing data transfer tasks, they can consolidate certain tasks onto the service processor to increase system performance. That is, the service processor can consolidate some of the operations from each of the line interfaces to reduce the number of accesses to the backplane thus saving backplane bandwidth. This is accomplished through the messaging system between the processors that is implemented in shared memory.
Features of the invention include provision of an ESCON I/O interface that effects high-speed, high efficiency transfer of data between an ESCON I/O host and Global Memory (GM) in an integrated cache disk array system. Separation of control information and user data early in the pipeline transfer facilitates highly efficient processing and simplified manipulation of user data traversing the pipeline(s). Segregation of user data into a FIFO avoids complex pointer management. Higher levels of data integrity are ensured by enhanced error detection implemented in the initial stages of the data traversing the pipes. Data order checking provides further enhancement of error detection ensuring the integrity of transferred data.
Further features include increased pipelining and data transfer rates facilitated by the assembler/disassembler mechanism which assembles data in order to burst data to the buffer DPR. Bursts of data are more efficient and yield higher performance than doing single beat data transfers. The implementation of upper, lower and middle machines to coordinate data transfers modularizes data management and facilitates pipelining and parallelism.
Avoidance of multiple copies of the code for each processor in each pipeline avoids wasting resources, such as EPROM or FLASH, and conserves board real estate. The sharing of substantial resources by the line processors keeps hardware requirements to a minimum. High levels of integration are achieved in the interface according to the invention as various mechanisms are implemented in very large scale application specific integrated circuits (ASICs) and programmable gate arrays. Thus fewer devices are used and higher reliability is attained.
Separation of facilities and code separates tasks for line processors and the service processor which simplifies source code modifications and maintenance. Maintenance and error detection and correction handled by the service processor minimizes affects on data transfers, as line processors handle data transfers maximizing data throughput and bandwidth. Implementation of a processor messaging system which can be used to prioritize transfers based on most desired operations improves overall system performance.