The present invention concerns a scheme for the connection of nodes in general, and the connection of nodes in an InfiniBand configuration in particular.
The PCI bus standard is widespread. Many of today's computer systems and switches have a PCI bus, and there is a huge variety of input/output (I/O) devices and PCI adapters available for such devices. A schematic representation of a typical PCI system 10 is illustrated in FIG. 1.
The InfiniBand standard is an example of a standard that concerns system area networks (SAN) connecting nodes, e.g., input/output (I/O) devices, within a distributed computer system. According to the InfiniBand standard, the I/O devices are connected through an InfiniBand fabric. InfiniBand is a common I/O specification that delivers a channel-based, switched fabric technology designed for adoption by the industry.
An example of an InfiniBand configuration is depicted in FIG. 2. As illustrated in FIG. 2, an InfiniBand system 30 typically comprises three different kinds of devices, namely, an InfiniBand fabric 20, a host system 21 with a host channel adapter (HCA) 22, and an input/output unit (IOU) 23 with an InfiniBand target channel adapter (TCA) 24. The IOU 23 may comprise one or more I/O devices 31, 32 and is attached to the InfiniBand fabric 20 via the TCA 24. A TCA is a component that terminates the SAN in an I/O device and that requires support only for capabilities appropriate to the respective I/O device. The HCA terminates the SAN in a host. It requires support for the ability to communicate with I/O devices in TCAs and to implement inter-processor communication (IPC) with other HCAs. An InfiniBand configuration comprises a software service package (SSP) that is responsible for the TCA initialization, connection establishment, management, and the service provision for device drivers. The SSP is installed on the host system.
Such a basic InfiniBand system 30 can be expanded by connecting additional host systems 25 and 26, and/or IOUs 27 and 28. The transition from PCI-based systems to InfiniBand-based ones will take some time and investment since the existing PCI devices have to be replaced step-by-step with new InfiniBand native devices.
Initially, most vendors and commercial users will want to attach the existing PCI devices to the InfiniBand fabric through InfiniBand-to-PCI bridges. This poses many difficulties because different hosts can be attached to a number of different devices sharing the same PCI bus. Therefore, a bridge will have to distinguish between the addresses that are posted on the PCI bus so that it can translate each address into the appropriate InfiniBand transaction and send that transaction to the respective host.
It is an object of the present invention to provide a method and apparatus for address translation on a PCI bus over a network, such as an InfiniBand network. The scheme presented herein delivers a cost-effective solution in that it allows legacy I/O devices, like PCI and PCI-X devices, to be connected to hosts via the InfiniBand network. With the present invention, one can realize communication systems that allow a multitude of I/O devices, host processor nodes, and I/O platforms to be connected in a high-bandwidth, low-latency, scalable environment.
The present invention concerns methods and systems enabling PCI devices to read data via an InfiniBand network from a memory region of a host system or to write data to a memory region on a host system via the InfiniBand network. The PCI devices are attached to a PCI bus and connected to the InfiniBand network via a target channel adapter that translates PCI bus transactions and interrupts into InfiniBand requests and that translates InfiniBand requests into PCI transactions. Each PCI device that is attached to the PCI bus has a PCI address range associated with it. According to the present invention, a PCI memory window is allocated on the target channel adapter, the PCI memory window being assigned to the host system. A pseudo address that belongs to the target channel adapter is posted on the PCI bus when reading data via the InfiniBand network from a host system or when writing data to a host system via the InfiniBand network, the pseudo address comprising a base part (VABase) and an offset part (Offset). The base part (VABase) is used to identify the PCI memory window being assigned to the host system, and the offset part (Offset) is used for calculating a virtual address (VA) specifying a physical memory location at the host system.
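By way of illustration only, the splitting of a pseudo address into its base part (VABase) and offset part (Offset) can be sketched as follows. The sketch rests on assumptions that are not taken from the description above: a fixed window size of 1 MiB (20 offset bits), a lookup table keyed by VABase, and a virtual address formed by adding the Offset to a per-window start address. The names `OFFSET_BITS`, `Window`, `split_pseudo_address`, and `translate` are hypothetical.

```python
from dataclasses import dataclass

# Assumption for this sketch: every PCI memory window spans 1 MiB,
# so the low 20 bits of a pseudo address form the Offset part.
OFFSET_BITS = 20
OFFSET_MASK = (1 << OFFSET_BITS) - 1


@dataclass(frozen=True)
class Window:
    """Hypothetical record for one PCI memory window on the TCA."""
    host_id: int       # hypothetical identifier of the host owning the window
    host_va_base: int  # hypothetical start of the host's memory region


def split_pseudo_address(pseudo_address):
    """Return the (VABase, Offset) parts of a pseudo address."""
    va_base = pseudo_address & ~OFFSET_MASK
    offset = pseudo_address & OFFSET_MASK
    return va_base, offset


def translate(windows, pseudo_address):
    """Map a pseudo address posted on the PCI bus to (host_id, VA).

    VABase selects the window assigned to a host; the Offset is added
    to that window's host-side start address to obtain the VA.
    """
    va_base, offset = split_pseudo_address(pseudo_address)
    window = windows.get(va_base)
    if window is None:
        raise LookupError(f"no window allocated at base {va_base:#x}")
    return window.host_id, window.host_va_base + offset
```

In this sketch, a table such as `{0x80000000: Window(host_id=1, host_va_base=0x7F0000000000)}` would route the pseudo address `0x80000ABC` to host 1 with an Offset of `0xABC`. An actual target channel adapter may use different field widths and a different rule for deriving the VA from the Offset.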
Advantages of the present invention are addressed in connection with the detailed description or are apparent from the description.