1. Field of the Invention
The present invention relates to an apparatus for distributed source and destination queuing in a high performance memory based switch. This invention relates additionally to improvements in shared memory switches and methods for operating same, and more particularly, to improved methods and apparatuses for reducing a data path latency and inter-frame delay associated with time slicing and bit slicing shared memory switches.
2. Relevant Background
Mainframes, super computers, mass storage systems, workstations, and very high resolution display subsystems are frequently connected together to facilitate file and print sharing. Common networks and channels used for these types of connections oftentimes introduce communications bottlenecking, especially in cases where the data is in a large file format typical of graphically-based applications.
There are two basic types of data communications connections between processors and between a processor and peripherals: a channel connection and a network connection. A “channel” provides a direct or switched point-to-point connection between communicating devices. The channel's primary task is merely to transport data at the highest possible data rate with the least amount of delay. Channels typically perform simple error correction in hardware. A “network,” by contrast, is an aggregation of distributed nodes (e.g., workstations, mass storage units) with its own protocol that supports interaction among these nodes. Typically, each node contends for the transmission medium, and each node must be capable of recognizing error conditions on the network and must provide the error management required to recover from the error conditions.
One type of communications interconnect that has been developed is Fibre Channel. The Fibre Channel protocol was developed and adopted as an American National Standard for Information Systems (ANSI). See Fibre Channel Physical and Signaling Interface, Revision 4.2, American National Standard for Information Systems (ANSI) (1993) for a detailed discussion of the fibre channel standard. Briefly, fibre channel is a switched protocol that allows concurrent communication among workstations, super computers and various peripherals. The total network bandwidth provided by fibre channel is on the order of a terabit per second. Fibre Channel is capable of transmitting frames at rates exceeding 1 gigabit per second in both directions simultaneously. It is also able to transport commands and data according to existing protocols such as Internet protocol (IP), small computer system interface (SCSI), high performance parallel interface (HIPPI) and intelligent peripheral interface (IPI) over both optical fiber and copper cable.
FIG. 1 illustrates a variable-length frame 11 as described by the Fibre Channel standard. The variable-length frame 11 comprises a 4-byte start-of-frame (SOF) indicator 12, which is a particular binary sequence indicative of the beginning of the frame 11. The SOF indicator 12 is followed by a 24-byte header 14, which generally specifies, among other things, the frame source address and destination address as well as whether the frame 11 is either control information or actual data. The header 14 is followed by a field of variable-length data 16. The length of the data 16 is 0 to 2112 bytes. The data 16 is followed successively by a 4-byte CRC (cyclical redundancy check) code 17 for error detection, and by a 4-byte end-of-frame (EOF) indicator 18. The frame 11 of FIG. 1 is much more flexible than a fixed frame and provides for higher performance by accommodating the specific needs of specific applications.
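The frame layout of FIG. 1 can be illustrated with a short sketch. The field sizes are those given in the description; the function names `frame_length` and `field_offsets` are hypothetical and are introduced here only for illustration, and an empty data field is taken as the lower bound.

```python
# Field sizes in bytes, as given in the description of frame 11.
SOF_BYTES = 4          # start-of-frame indicator 12
HEADER_BYTES = 24      # header 14
MAX_DATA_BYTES = 2112  # upper bound on variable-length data 16
CRC_BYTES = 4          # CRC code 17
EOF_BYTES = 4          # end-of-frame indicator 18


def frame_length(data_bytes):
    """Return the total frame length in bytes for a given data-field size."""
    if not 0 <= data_bytes <= MAX_DATA_BYTES:
        raise ValueError("data field must be 0 to 2112 bytes")
    return SOF_BYTES + HEADER_BYTES + data_bytes + CRC_BYTES + EOF_BYTES


def field_offsets(data_bytes):
    """Return a dict mapping each field name to its (start, end) byte range."""
    frame_length(data_bytes)  # reuse the range check
    sizes = [
        ("sof", SOF_BYTES),
        ("header", HEADER_BYTES),
        ("data", data_bytes),
        ("crc", CRC_BYTES),
        ("eof", EOF_BYTES),
    ]
    offsets = {}
    position = 0
    for name, size in sizes:
        offsets[name] = (position, position + size)
        position += size
    return offsets
```

Under these sizes, the fixed overhead per frame is 36 bytes, so a frame carrying the maximum data field occupies 2148 bytes.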
FIG. 2 illustrates a block diagram of a representative fibre channel architecture in a fibre channel network 100. A workstation 120, a mainframe 122 and a super computer 124 are interconnected with various subsystems (e.g., a tape subsystem 126, a disk subsystem 128, and a display subsystem 130) via a fibre channel fabric 110 (i.e., a fibre channel switch). The fabric 110 is an entity that interconnects various node-ports (N_ports) 140 and their associated workstations, mainframes and peripherals attached to the fabric 110 through the F_ports 142. The essential function of the fabric 110 is to receive frames of data from a source N_port and, using a first protocol, route the frames to a destination N_port. In a preferred embodiment, the first protocol is the fibre channel protocol. Other protocols, such as the asynchronous transfer mode (ATM), could be used without departing from the scope of the present invention.
Essentially, the fibre channel is a channel-network hybrid, containing enough network features to provide the needed connectivity, distance and protocol multiplexing, and enough channel features to retain simplicity, repeatable performance and reliable delivery. Fibre Channel allows for an active, intelligent interconnection scheme, known as a “fabric,” or fibre channel switch, to connect devices. The fabric includes a plurality of fabric-ports (F_ports) that provide for interconnection and frame transfer between a plurality of node-ports (N_ports) attached to associated devices that may include workstations, super computers and/or peripherals. The fabric has the capability of routing frames based upon information contained within the frames. The N_port manages the simple point-to-point connection between itself and the fabric. The type of N_port and associated device dictates the rate that the N_port transmits and receives data to and from the fabric. Transmission is isolated from the control protocol so that different topologies (e.g., point-to-point links, rings, multidrop buses, cross point switches) can be implemented.
The Fibre Channel industry standard also provides for several different types of data transfers. A class 1 transfer requires circuit switching, i.e., a reserved data path through the network switch, and generally involves the transfer of more than one frame, oftentimes numerous frames, between two identified network elements. In contrast, a class 2 transfer requires allocation of a path through the network switch for each transfer of a single frame from one network element to another. Frame switching for class 2 transfers is more difficult to implement than class 1 circuit switching, as frame switching requires a memory mechanism for temporarily storing incoming frames in a source queue prior to their routing to a destination port, or a destination queue at a destination port. A memory mechanism typically includes numerous input/output (I/O) connections with associated support circuitry and queuing logic. Additional complexity and hardware are required when channels carrying data at different bit rates are to be interfaced.
It is known to employ centralized queuing. Centralized queuing is inherently slow, as a common block of logic must be employed for all routing decisions within the switch.
It is also known to employ distributed source queuing, which has apparent disadvantages when the frame at the head of the queue is destined to a port that is already forwarding a frame such that the path is blocked and the frame cannot be transferred. Alternatively, it is known to employ distributed destination queuing, which has the apparent disadvantage of a large destination queue at each port, since it is possible for all frames within the switch to be simultaneously queued to the same destination port.
Another disadvantage of distributed destination queuing is apparent when the frame at the head of the queue is sourced from a port that is already forwarding a frame such that the path is blocked and the frame cannot be transferred.
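The blocking behavior described for source queuing can be illustrated with a small sketch. It models a single instant in time, with each queued frame represented only by its destination port number. The names `drain_fifo` and `drain_skipping` are hypothetical; the second function is a simple contrast case and is not drawn from the present description.

```python
from collections import deque


def drain_fifo(frames, busy_ports):
    """Forward frames strictly in queue order; stop at the first blocked head.

    frames: destination port numbers in arrival order.
    busy_ports: set of destination ports that are already forwarding a frame.
    Returns (sent, still_queued).
    """
    queue = deque(frames)
    sent = []
    while queue and queue[0] not in busy_ports:
        sent.append(queue.popleft())
    return sent, list(queue)


def drain_skipping(frames, busy_ports):
    """Contrast case: forward every frame whose destination is free.

    This ignores queue position entirely and is shown only to make the
    cost of strict FIFO ordering visible.
    """
    sent = [dest for dest in frames if dest not in busy_ports]
    waiting = [dest for dest in frames if dest in busy_ports]
    return sent, waiting


if __name__ == "__main__":
    frames = [2, 3, 1]   # head of the queue is destined for port 2
    busy = {2}           # port 2 is already forwarding a frame
    print(drain_fifo(frames, busy))      # ([], [2, 3, 1])
    print(drain_skipping(frames, busy))  # ([3, 1], [2])
```

In the FIFO case the frames destined for ports 3 and 1 remain queued even though their destinations are free, because the frame at the head of the queue cannot be transferred.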
Thus, a heretofore unaddressed need exists in the industry for new and improved systems for implementing the Fibre Channel industry standard for transfers on fiber optic networks with much higher performance and flexibility than presently existing systems. Particularly, there is a significant need for a method and apparatus that combines both distributed source and destination queuing in a high performance memory based switch. A need also exists to implement distributed queues between the source and destination ports, requiring the lower queue storage resources of source queuing, but providing the high throughput of destination queuing and avoiding “head-of-line” blocking of either source or destination queuing.
It would be desirable and of considerable advantage to provide a Fibre Channel switch that provides for efficient transfer of queuing information between Fibre Channel ports, especially if the new switch provides an improvement in any of the following areas: increased bandwidth, decreased no-load latency, and increased throughput under load (due to parallelism of distributed queuing).
It will be apparent from the foregoing that there is still a need for a high bandwidth memory-based switch employing distributed queuing that differs from that employed in existing centralized Fibre Channel switch architectures. In addition, there is a need for a method and apparatus for reducing the data path latency and the minimum inter-frame delay normally associated with time slicing and bit slicing shared memory switches.
In light of the above, therefore, it is an object of the invention to provide an improved shared memory switch and method for operating same.
It is another object of the invention to provide methods and apparatuses for reducing data path latency and inter-frame delay associated with time slicing and bit slicing shared memory switches.
These and other objects, features and advantages of the invention will be apparent to those skilled in the art from the following detailed description of the invention, when read in conjunction with the accompanying drawings and appended claims.
Thus, in accordance with a broad aspect of the invention, a method is presented for operating a time slicing and bit slicing shared memory switch. The method includes receiving a plurality of data frames in a respective plurality of input channels to the switch. The plurality of data frames are applied to a shared memory in a time sliced manner. The time slice process is arranged so that a time slice for each section of a shared memory is staggered so that on any clock cycle, one memory portion is accessed for writing at least some of the data frames to the memory and on a next clock cycle the memory portion is accessed for reading at least a portion of the data from the memory.
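One way to picture the staggered time slices is the following minimal sketch. It assumes, purely for illustration, that each memory section alternates between a write cycle and a read cycle and that the start of this alternation is offset by one clock cycle per section. The function name `access`, the one-cycle offset, and the round-robin port assignment are assumptions and are not taken from the description.

```python
def access(cycle, section, num_ports):
    """Return (operation, port) for one memory section on one clock cycle.

    Assumed schedule: section s starts its write/read alternation s cycles
    after section 0, so neighbouring sections are always in opposite phases.
    Each write/read pair is assigned to ports in round-robin order.
    """
    slot = cycle - section                 # stagger by section index
    operation = "write" if slot % 2 == 0 else "read"
    port = (slot // 2) % num_ports         # floor division keeps this periodic
    return operation, port


def schedule(num_cycles, num_sections, num_ports):
    """Return a table: one row per cycle, one (operation, port) per section."""
    return [
        [access(cycle, section, num_ports) for section in range(num_sections)]
        for cycle in range(num_cycles)
    ]


if __name__ == "__main__":
    for cycle, row in enumerate(schedule(4, 4, 4)):
        print(cycle, row)
```

Under this assumed schedule, any section that is written on one clock cycle is read on the next, and on every cycle some sections are being written while others are being read.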
According to another broad aspect of the invention, a method is presented for reducing data path latency and an inter-frame delay associated with time slicing and bit slicing shared memory switches. The method includes receiving a respective plurality of data frames and locations in partitions that are associated with the plurality of data frames. Corresponding ones of the data frames are applied to respective memory partitions identified as a function of a time slice number, wherein data is applied to the partitions in a time sliced manner, and wherein a time slice for each section of a shared memory is staggered so that on any clock cycle, one memory partition is being accessed for writing of at least one of the data frames and on a next clock cycle the one memory portion may be accessed for reading at least a portion of the data from the memory.
According to still another broad aspect of the invention, an apparatus is presented for reducing data path latency and an inter-frame delay associated with time slicing and bit slicing shared memory switches. The apparatus includes a bus for receiving a plurality of data frames in a respective plurality of input channels to the switch. A slice crosspoint applies the plurality of data frames to a shared memory in a time sliced manner. The time slice is established for each section of a shared memory to be staggered so that on any clock cycle, one memory portion is being accessed for writing at least some of the data frames and on a next clock cycle the memory portion is accessed for reading at least a portion of the data.
According to yet another broad aspect of the invention, an apparatus is provided for reducing a data path latency and an inter-frame delay of a time slicing and bit slicing shared memory switch. The apparatus includes a plurality of memory write data buses for receiving a respective plurality of data frames and a plurality of memory write address buses for supplying locations in memory partitions associated with the plurality of data frames. An address slice crosspoint identifies memory partitions by a time slice number, identified by portions of the addresses received from the memory write address buses. A data slice crosspoint applies corresponding ones of the data frames to respective memory partitions identified by a corresponding time slice number by the address slice crosspoint. In operation, data is applied to the partitions in a time sliced manner by which a time slice for each section of a shared memory is staggered so that on any clock cycle, one memory partition is being accessed for writing of at least one of the data frames and on a next clock cycle the one memory portion may be accessed for reading at least a portion of the data from the memory.
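The idea of identifying a memory partition from a portion of a write address can be sketched as follows. The description does not state which address bits carry the time slice number; this sketch assumes the low-order bits do and that the number of partitions is a power of two. The names `split_address` and `route_writes` and the default of four partitions are hypothetical.

```python
NUM_PARTITIONS = 4  # assumed value; must be a power of two for the masks below


def split_address(address, num_partitions=NUM_PARTITIONS):
    """Split a write address into (time_slice, offset_within_partition).

    Assumption: the low-order bits of the address select the partition.
    """
    if num_partitions <= 0 or num_partitions & (num_partitions - 1):
        raise ValueError("num_partitions must be a power of two")
    slice_bits = num_partitions.bit_length() - 1
    time_slice = address & (num_partitions - 1)
    offset = address >> slice_bits
    return time_slice, offset


def route_writes(writes, num_partitions=NUM_PARTITIONS):
    """Group (address, word) pairs by partition, as a data crosspoint might.

    Returns a dict mapping partition number to a list of (offset, word).
    """
    partitions = {p: [] for p in range(num_partitions)}
    for address, word in writes:
        time_slice, offset = split_address(address, num_partitions)
        partitions[time_slice].append((offset, word))
    return partitions
```

With four partitions, consecutive addresses land in consecutive partitions, so the words of one frame are spread across all partitions rather than concentrated in one.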