1. Field of the Invention
The present invention generally relates to input/output (I/O) data transmission devices, and more particularly to first-in-first-out (FIFO) buffer devices in I/O data transmission paths.
2. Description of the Related Art
InfiniBand (registered Trademark of the InfiniBand Trade Association, Portland, Oreg.) architecture is a new common I/O specification to deliver a channel based, switched-fabric technology that the entire hardware and software industry can adopt. A network and components associated with an InfiniBand network 100 are shown in FIG. 1a. InfiniBand based networks are designed to satisfy bandwidth-hungry network applications, such as those combining voice, data, and video on the Internet. InfiniBand architecture is being developed by the InfiniBand Trade Association that includes many hardware and software companies. Its robust layered design enables multiple computer systems and peripherals to work together more easily as a single high-performance and highly available server.
Being a fabric-centric, message-based architecture, InfiniBand is ideally suited for clustering, input/output extension, and native attachment in diverse network applications. InfiniBand technology can be used to build remote card cages 15 or connect to attached hosts 35, routers 40, or disk arrays 50. InfiniBand also features enhanced fault isolation, redundancy support, and built-in failover capabilities to provide high network reliability and availability. Featuring high performance and reliability, InfiniBand devices provide solutions for a range of network infrastructure components, including servers and storage area networks.
FIG. 1b is a block diagram showing, in exemplary form, InfiniBand components in a portion of the network shown in FIG. 1a. These components have input/output interfaces, each forming part of a target channel adapter (TCA) 10, a host channel adapter (HCA) 20, an interconnect switch device 30, or a router 40. Each has an application specific integrated circuit (ASIC) core interface that includes an InfiniBand Technology Link Protocol Engine (IBT-LPE) core, and these cores connect the ASICs of the components through links 25 in an InfiniBand Technology (IBT) network 100. The IBT-LPE core supports a range of functionality that is required by all IBT devices in the upper levels of the physical layer and the lower link layer. It also handles the complete range of IBT bandwidth requirements, up to and including a 4-wide link operating at 2.5 gigabits per second. The IBT-LPE core (large integrated circuit design) in the upper levels of the physical layer and the link layer core of the ASIC comply with standards established by the InfiniBand Trade Association in the IBTA 1.0 specifications (2001). Such architectures decouple the I/O subsystem from memory by using channel based point to point connections rather than shared bus, load and store configurations.
The TCA 10 provides an interface for InfiniBand-type data storage and communication components. Creating InfiniBand adapters that leverage the performance benefits of the InfiniBand architecture is accomplished through a cooperative, coprocessing approach to the design of an InfiniBand and native I/O adapter. The TCA 10 provides a high-performance interface to the InfiniBand fabric, and the host channel communicates with a host based I/O controller using a far less complex interface consisting of queues, shared memory blocks, and doorbells. Together, the TCA and the I/O controller function as an InfiniBand I/O channel deep adapter. The TCA implements the entire mechanism required to move data between queues and to share memory on the host bus and packets on the InfiniBand network in hardware. The combination of hardware-based data movement with optimized queuing and interconnect switch priority arbitration schemes working in parallel with the host based I/O controller functions maximizes the InfiniBand adapter's performance.
The HCA 20 enables connections from a host bus to a dual 1X or 4X InfiniBand network. This allows an existing server to be connected to an InfiniBand network and communicate with other nodes on the InfiniBand fabric. The host bus to InfiniBand HCA integrates a dual InfiniBand interface adapter (physical, link and transport levels), host bus interface, direct memory access (DMA) engine, and management support. It implements a layered memory structure in which connection-related information is stored in either channel on-device or off-device memory attached directly to the HCA. It features adapter pipeline header and data processing in both directions. Two embedded InfiniBand microprocessors and separate DMA engines permit concurrent receive and transmit data-path processing.
The interconnect switch 30 can be an 8-port 4X switch that incorporates eight InfiniBand ports and a management interface. Each port can connect to another switch, the TCA 10, or the HCA 20, enabling configuration of multiple servers and peripherals that work together in a high-performance InfiniBand based network. The interconnect switch 30 integrates the physical and link layer for each port and performs filtering, mapping, queuing, and arbitration functions. It includes multicast support, as well as performance and error counters. The management interface connects to a management processor that performs configuration and control functions. The interconnect switch 30 can typically provide a maximum aggregate channel throughput of 64 gigabits per second, integrates buffer memory, and supports up to four data virtual lanes (VL) and one management VL per port.
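The text does not say how the 64 gigabit figure is derived. The following minimal sketch shows one reading that is consistent with the numbers given elsewhere in this description (eight ports, 4-wide links, 2.5 gigabits per second per lane). It assumes 8b/10b line coding, in which every 10 transmitted bits carry 8 data bits; that coding ratio and the function name are assumptions made for illustration.

```python
def aggregate_throughput_gbps(ports, lanes_per_port, lane_rate_gbps,
                              payload_bits=8, symbol_bits=10):
    """Return (raw, data) aggregate throughput in gigabits per second.

    payload_bits/symbol_bits models an assumed 8b/10b line code; the
    source text does not state which coding the figure is based on.
    """
    raw = ports * lanes_per_port * lane_rate_gbps
    data = raw * payload_bits / symbol_bits
    return raw, data


# Values taken from the description: 8-port switch, 4X links, 2.5 Gb/s lanes.
raw_gbps, data_gbps = aggregate_throughput_gbps(8, 4, 2.5)
print(f"raw signalling rate: {raw_gbps:.0f} Gb/s")   # 80 Gb/s
print(f"data throughput:     {data_gbps:.0f} Gb/s")  # 64 Gb/s
```

Under these assumptions the raw signalling rate is 80 gigabits per second and the data throughput after coding overhead is 64 gigabits per second.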
FIG. 2 illustrates the core logic 210 that connects an InfiniBand transmission media 280 (the links 25 shown in FIG. 1b) to an application specific integrated circuit (ASIC) 240 (such as the TCA 10, the HCA 20, the switch 30, the router 40, etc. as shown in FIG. 1b). The core logic 210 illustrated in FIG. 2 is improved using the invention described below. The core logic 210 shown in FIG. 2 is not necessarily prior art and may not be generally known to those ordinarily skilled in the art at the time of filing of the invention. While the core logic 210 is shown as being separate from the ASIC 240 in FIG. 2, as would be known by one ordinarily skilled in the art, the core logic is generally part of the ASIC.
The receive and transmit data transmission media clock 280 may operate at a different frequency (e.g., 250 MHz +/−100 parts per million on the receive path, whereas the core logic 210 transmit data path may operate at 250 MHz). Further, in turn, the core 210 may operate at a different frequency compared to the ASIC 240 clock speed (e.g., 312 MHz).
To accommodate the different speeds of the data signals being handled, the core logic 210 includes a serialization portion 270 that includes serialization/deserialization units 225, 227. The structure and operation of such serialization/deserialization units are well known to those ordinarily skilled in the art and will not be discussed in detail herein so as not to unnecessarily obscure the salient features of the invention.
The InfiniBand transmission media 280 is made up of a large number of serial transmission lanes that form the links 25. The receive serialization/deserialization units 225 deserialize the signals from the transmission media 280 and perform sufficient conversion to reduce the frequency to one that is acceptable to the core logic 210. For example, if the serialization/deserialization receive units 225 operate to deserialize 10 bits at a time, a 10-to-1 reduction occurs that reduces the 2.5 gigabit per second speed on the transmission media 280 into a 250 MHz frequency that is acceptable to the core logic 210.
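The 10-to-1 reduction described above is simple arithmetic, sketched below. The function and constant names are hypothetical and chosen only for illustration.

```python
def parallel_clock_hz(line_rate_bps, bits_per_word):
    """Clock rate on the parallel side of a deserializer.

    Each parallel word carries bits_per_word serial bits, so the word
    clock is the serial bit rate divided by the word width.
    """
    if bits_per_word <= 0:
        raise ValueError("bits_per_word must be positive")
    return line_rate_bps / bits_per_word


LANE_RATE_BPS = 2.5e9   # 2.5 gigabits per second on the transmission media 280
WORD_WIDTH_BITS = 10    # bits deserialized at a time by the receive units 225

core_clock_hz = parallel_clock_hz(LANE_RATE_BPS, WORD_WIDTH_BITS)
print(f"{core_clock_hz / 1e6:.0f} MHz")  # prints "250 MHz"
```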
The core logic 210 also includes a frequency correction unit 260. The frequency of the signal propagating along the transmission media 280 may not always occur at this wire speed, but instead may be slightly above or below the desired frequency (e.g. by up to 100 parts per million). This inconsistency in the frequency is transferred through the serialization/deserialization units 225. The frequency correction unit 260 includes FIFO buffers 261 that buffer the signal being output by the serialization/deserialization units 225 so as to provide the signal in a uniform 250 MHz frequency to the upper link layer logic 250.
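To give a sense of scale for the buffering involved, the sketch below estimates how quickly a 100 parts per million offset accumulates into a full word of drift at 250 MHz. It is a simplified model that assumes the read side runs at exactly the nominal frequency and that nothing is inserted or removed to compensate. How the frequency correction unit 260 actually absorbs the drift is not described here, and all names are hypothetical.

```python
import math
from fractions import Fraction

NOMINAL_HZ = 250_000_000  # uniform frequency delivered to the upper link layer logic 250


def cycles_per_slip(offset_ppm):
    """Read-clock cycles needed for the write side to gain or lose one word."""
    if offset_ppm == 0:
        raise ValueError("no drift at zero offset")
    return Fraction(1_000_000, abs(offset_ppm))


def depth_after(read_cycles, offset_ppm, start_depth):
    """FIFO depth after read_cycles, with no compensation applied.

    The writer runs at nominal * (1 + offset_ppm / 1e6); the reader is
    assumed to run at exactly the nominal frequency.
    """
    write_ratio = Fraction(1_000_000 + offset_ppm, 1_000_000)
    words_written = math.floor(read_cycles * write_ratio)
    return start_depth + words_written - read_cycles


slip = cycles_per_slip(100)
print(f"one word of drift every {slip} cycles "
      f"({float(slip / NOMINAL_HZ) * 1e6:.1f} microseconds)")
print("fast writer:", depth_after(10_000, +100, start_depth=4))  # depth grows to 5
print("slow writer:", depth_after(10_000, -100, start_depth=4))  # depth shrinks to 3
```

In this model a buffer of fixed depth eventually overflows or underflows unless words are periodically added or dropped, which is why the FIFO buffers 261 are needed to present a uniform rate.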
The upper link layer logic 250 includes additional FIFO buffers 251 that convert the frequency of the signal output from the frequency correction unit 260 into a frequency that is acceptable to the ASIC 240. During transmission of a signal from the ASIC 240 to the transmission media 280, the process is reversed and the upper link layer logic 250 utilizes different FIFO buffers 253. Similarly, the serialization unit 270 uses other transmission serialization/deserialization units 227. Note that no correction is required by the frequency correction unit 262 for signals that are being transmitted to the transmission media 280 because the ASIC 240 generally produces a signal that does not need to be corrected.
One disadvantage of the core logic 210 shown in FIG. 2 is the large number of buffers 251, 253, 261 that are required by the upper link layer logic 250 and the frequency correction unit 260. These buffers use substantial circuit power and reduce operational speed of data being processed through the core logic 210. Therefore, there is a need to reduce the number of buffers within the core logic 210 to reduce this power usage and increase processing speed.
In view of the foregoing problems, the present invention has been devised. It is an object of the present invention to provide a parallel-serial architecture network that includes a transmission media and at least one processor connected to the transmission media by a core. The core provides communications between the transmission media and the processor.
The core includes a logic layer connected to the processor, serial lanes connecting the logic layer to the transmission media, and receive and transmit buffers within the serial lanes. The receive buffers correct for fluctuations in the transmission media and alter the frequency of signals being processed along the serial lanes.
The invention may also include serializer/deserializers within the serial lanes. The receive buffers and the transmit buffers are preferably elastic first-in, first-out (FIFO) buffers, and the receive buffers and the transmit buffers are both external to the logic layer. The transmit buffers alter a frequency of signals being transferred from the logic layer to the transmission media, while the receive buffers process signals being transferred from the transmission media to the logic layer. The “processor” can be a host channel adapter, a target channel adapter, or an interconnect switch of the network.
With the invention the receive buffers perform the functions that were previously performed by FIFO buffers 251 and FIFO buffers 261 in the structure shown in FIG. 2. Thus, the invention reduces the number of buffers within the core logic 210. This decrease in the number of buffers within the core logic 210 reduces power consumption, increases processing speed and decreases the chip area (e.g., footprint) consumed by the core logic 210.
Integration of frequency correction and frequency adjustment processes into the input receive elastic FIFOs 220 also enables the upper layer logic 250 to have clock frequencies that are greater than external components connected thereto. Thus, the invention moves the clock domain conversion to a lower logic level compared to the structure shown in FIG. 2.
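The idea of a single receive elastic FIFO that both absorbs media-side frequency offset and crosses into a faster clock domain can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the circuit of the invention: the write side runs at 250 MHz plus 100 parts per million, the read side runs at the 312 MHz example frequency mentioned earlier, and a read from an empty FIFO is treated as an idle cycle. The idle handling, the capacity of four words, and all names are hypothetical.

```python
from collections import deque
from fractions import Fraction


class ElasticFifo:
    """Bounded FIFO; pop() returns None (an idle cycle) when empty."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._slots = deque()

    def push(self, word):
        if len(self._slots) >= self.capacity:
            raise OverflowError("elastic FIFO overflow")
        self._slots.append(word)

    def pop(self):
        return self._slots.popleft() if self._slots else None


def simulate_crossing(write_hz, read_hz, n_words, capacity=4):
    """Move n_words from a write clock domain to a read clock domain.

    Clock edges are ordered using exact rational timestamps. Returns the
    words seen by the reader and the number of idle read cycles.
    """
    fifo = ElasticFifo(capacity)
    write_period = Fraction(1, write_hz)
    read_period = Fraction(1, read_hz)
    next_write, next_read = write_period, read_period
    written, received, idles = 0, [], 0
    while len(received) < n_words:
        if written < n_words and next_write <= next_read:
            fifo.push(written)          # write-clock edge: store next word
            written += 1
            next_write += write_period
        else:
            word = fifo.pop()           # read-clock edge: fetch or idle
            if word is None:
                idles += 1
            else:
                received.append(word)
            next_read += read_period
    return received, idles


# 250 MHz + 100 ppm on the media side, 312 MHz on the logic side.
words, idle_cycles = simulate_crossing(250_025_000, 312_000_000, n_words=1000)
print("in order:", words == list(range(1000)))
print("idle read cycles:", idle_cycles)
```

Because the read clock is faster than the write clock in this model, the FIFO never fills and the reader simply sees occasional idle cycles. The same buffer therefore handles both the small media-side offset and the larger clock domain change, which is the consolidation the paragraph above describes.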