Peripheral Component Interconnect Express (PCIe) is a high-performance system bus used on computing and communication platforms. The PCIe bus is widely applied in systems in which a CPU is interconnected with peripheral devices, and serves as a core service channel in computing and storage devices. Many types of peripheral devices are interconnected with the CPU through a PCIe bus, such as a network adapter or a solid state disk (SSD). Such devices are herein uniformly called PCIe endpoint devices.
The PCIe bus is widely applied as a bus interface of a server or a storage system. While the system is running normally, online expansion and maintenance require that PCIe endpoint devices be added or removed without interrupting power, which is known as hot-swapping. In the prior art, PCIe hot-swapping follows this procedure: an operator initiates a hot-swap request by pressing a button; a hot-swap controller detects the hot-swap event, notifies all drivers in the system that may access the PCIe endpoint device to stop accessing it, and unloads the resources of the PCIe endpoint device that is to be hot-swapped; the PCIe endpoint device is then powered off, and the operator unplugs it.
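The notified hot-swap procedure described above can be illustrated with a minimal Python sketch. It is a simulation only, not an implementation of any real hot-swap controller or driver interface: the class names (`Driver`, `EndpointDevice`, `HotSwapController`), the `unplug` helper, and the log strings are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Driver:
    """Hypothetical driver that may access the PCIe endpoint device."""
    name: str
    accessing: bool = True

    def stop_access(self) -> None:
        # The driver stops issuing accesses once it has been notified.
        self.accessing = False


@dataclass
class EndpointDevice:
    """Hypothetical PCIe endpoint device, e.g. a network adapter or an SSD."""
    name: str
    resources_loaded: bool = True
    powered: bool = True
    present: bool = True


@dataclass
class HotSwapController:
    """Hypothetical controller that sequences a notified hot-swap."""
    drivers: List[Driver]
    log: List[str] = field(default_factory=list)

    def on_button_pressed(self, device: EndpointDevice) -> None:
        # Step 1: the operator's button press is seen as a hot-swap event.
        self.log.append("hot_swap_requested")
        # Step 2: every driver that may access the device is told to stop.
        for driver in self.drivers:
            driver.stop_access()
        self.log.append("drivers_stopped")
        # Step 3: the resources of the device are unloaded.
        device.resources_loaded = False
        self.log.append("resources_unloaded")
        # Step 4: the device is powered off.
        device.powered = False
        self.log.append("powered_off")


def unplug(device: EndpointDevice) -> None:
    """Step 5: the operator removes the device, only after power-off."""
    if device.powered:
        raise RuntimeError("device must be powered off before it is unplugged")
    device.present = False
```

In this sketch, calling `unplug` before `on_button_pressed` raises an error, which mirrors the requirement that the system be notified before the device is physically removed.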
In the prior art, hot-swapping of a PCIe endpoint device requires advance notification in order to ensure normal running of the system. In recent years, however, the PCIe bus has evolved from intra-system interconnection to inter-system interconnection, and peripherals such as external cables are increasingly used. Such cables tend to become disconnected abnormally, so that the PCIe endpoint device may go offline abnormally without advance notification. In addition, it is increasingly common for a user to attach an SSD directly to the system, and, for reasons such as user habits, the user may plug or unplug the SSD without prior notification. In these situations, where the PCIe endpoint device goes offline suddenly and abnormally, if the CPU has already sent an instruction to read or write the PCIe endpoint device, that instruction remains in a to-be-executed state indefinitely. When such pending instructions for accessing the PCIe endpoint device accumulate to a certain extent, the CPU considers the entire system abnormal, reports a machine check exception (MCE) error, and performs a reset.
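The failure mode described above, in which accesses to a suddenly removed device pile up until the CPU reports an MCE and resets, can be modeled with a small Python sketch. This is a toy model under stated assumptions: the text says only that instructions accumulate "to a certain extent", so the limit `PENDING_LIMIT`, the `Cpu` class, and the `MachineCheckException` type are hypothetical, and real MCE behavior is specific to the processor.

```python
class MachineCheckException(Exception):
    """Hypothetical stand-in for the MCE error reported by the CPU."""


# Assumed limit on pending instructions; the actual limit is not specified.
PENDING_LIMIT = 4


class Cpu:
    """Toy model of a CPU issuing reads and writes to a PCIe endpoint device."""

    def __init__(self, pending_limit: int = PENDING_LIMIT) -> None:
        self.pending_limit = pending_limit
        self.pending = []       # instructions stuck in the to-be-executed state
        self.reset_count = 0    # number of resets performed so far

    def access(self, instruction: str, device_online: bool) -> bool:
        """Issue one instruction; return True if it completes."""
        if device_online:
            # The device responds, so the instruction completes normally.
            return True
        # The device went offline without notice: the instruction never completes.
        self.pending.append(instruction)
        if len(self.pending) >= self.pending_limit:
            # Too many stuck instructions: report an MCE and reset.
            self.pending.clear()
            self.reset_count += 1
            raise MachineCheckException("pending accesses exceeded the limit")
        return False
```

With the assumed limit of 4, three accesses to an offline device merely remain pending, and the fourth raises `MachineCheckException` and increments `reset_count`.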