Technical Field
The disclosure relates to a Peripheral Component Interconnect Express (hereinafter referred to as PCIe) device, and to a PCIe network system with fail-over capability and an operation method thereof.
Description of Related Art
Peripheral Component Interconnect Express (PCIe) is an industry-standard computer expansion technology developed by the PCI Special Interest Group (PCI-SIG). PCIe was initially designed as a local bus interconnect technology for connecting the CPU, GPU, and I/O devices within a machine, and has since developed into a mature switched network featuring point-to-point links, hop-by-hop flow control, end-to-end retransmission, and so on. PCIe may also be used as a passive backplane interconnect among boards and as an expansion interface for connecting the machine to an external apparatus (e.g., a storage box).
A PCIe network is a switched network with serial point-to-point full-duplex lanes. A PCIe device is connected to the PCIe network through a link formed by one or more lanes. Recently, expanded PCIe, which uses a PCIe interface to interconnect multiple servers or virtualized I/O devices, has attracted interest. For example, the application of PCIe may be further expanded to intra-rack interconnect. A PCIe switch may replace a standard top-of-rack (TOR) Ethernet switch. That is, PCIe may connect multiple hosts (e.g., servers) in one rack. The I/O devices connected to the PCIe switch may be shared by all the servers in the same rack. All the servers in the rack may also communicate with each other through PCIe links.
Extending PCIe to the multi-server environment also brings new challenges. A main limitation of the traditional PCIe architecture is that, at any given time, each PCIe domain has only one active root complex. As a result, two servers are not allowed to coexist in the same PCIe domain. For PCIe to become a feasible system for communication and interconnection between the hosts in a rack, an additional fail-over mechanism is needed to ensure that network operation can continue when any control plane or data plane fails.