The present disclosure relates generally to information handling systems, and more particularly to the hot plugging of devices in an information handling system.
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Information handling systems sometimes utilize devices that may be “hot-plugged”, “hot-swapped”, “hot-inserted”, and/or otherwise communicatively coupled to the system without shutting down or stopping the system. For example, server devices often utilize hot-plug devices for storage. The conventional hot-plugging of some single communication lane storage devices such as, for example, Serial Attached Small Computer System Interface (SCSI) (SAS) storage devices and Serial AT Attachment (SATA) storage drives, is relatively stable. However, the hot-plugging of multiple communication lane storage devices has been found to raise some issues. For example, Peripheral Component Interconnect (PCI) express (PCIe) Non-Volatile Memory storage devices (NVMe storage devices) experience several issues when utilizing conventional hot-plugging techniques.
Such NVMe storage devices typically include securing latches that are rotatably coupled to one end of the front face of the NVMe storage device, with those securing latches also coupled to a securing subsystem on the NVMe storage device that is configured to engage the server chassis and/or connector to which the NVMe storage device is being connected. Conventionally, when the securing latch is rotated away from the front face of the NVMe storage device such that the securing subsystem is not activated, the NVMe storage device may be positioned in the storage device housing that is defined by the server system and located adjacent a server system connector. The securing latch may then be rotated towards the front face of the NVMe storage device in order to activate the securing subsystem to engage the server system and/or the server system connector and mate the storage device connector on the NVMe storage device with the server system connector in the server system.
It has been found that the use of such securing latches to mate the storage device connector with the server system connector results in a sequential engagement of the pins on those connectors that can cause issues with the subsequent use of the NVMe storage device. For example, as the securing latch is rotated towards the front face of the NVMe storage device, the pins on the server system connector that are closest to the point of contact between the securing subsystem and the server system/server system connector (e.g., adjacent the fulcrum/rotatable coupling of the securing latch with the front face of the NVMe storage device) engage corresponding pins on the storage device connector first, followed sequentially by the pins that are farther away from that point of contact. The server system connector includes four sets of communication lane pins that are distributed across its length, with the fourth communication lane pins located closest to that point of contact, and the first communication lane pins located farther away from the point of contact between the securing subsystem and the server system/server system connector. Furthermore, the server system connector includes a reset pin (e.g., a PERST# pin) that is located approximately midway along the length of the server system connector, and a pair of reference clock pins that are located adjacent the end of the server system connector that is opposite the point of contact between the securing subsystem and the server system/server system connector.
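The sequential engagement described above may be illustrated with a minimal sketch. Only the relative order of the pin groups is taken from the description; the normalized positions, the constant-rate sweep, and the names `PIN_POSITIONS`, `engagement_order`, and `engagement_times_ms` are hypothetical and chosen for illustration only.

```python
# Hypothetical normalized positions along the server system connector:
# 0.0 is the end nearest the point of contact, 1.0 is the opposite end.
# Only the relative order comes from the description; the values are assumed.
PIN_POSITIONS = {
    "lane4": 0.1,
    "lane3": 0.3,
    "reset": 0.5,   # approximately midway along the connector
    "lane2": 0.7,
    "lane1": 0.9,
    "refclk": 1.0,  # adjacent the end opposite the point of contact
}


def engagement_order(positions):
    """Return pin-group names sorted from nearest to farthest from the point of contact."""
    return sorted(positions, key=positions.get)


def engagement_times_ms(positions, sweep_ms):
    """Assume (for illustration) that mating sweeps the connector at a constant rate."""
    return {name: pos * sweep_ms for name, pos in positions.items()}
```

Under these assumptions, `engagement_order(PIN_POSITIONS)` lists the fourth communication lane pins first and the reference clock pins last, which is the ordering that gives rise to the issues discussed below.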
In one example of issues resulting from such device hot-plug systems, the engagement of the storage device connector with the server system connector results in engagement of the storage device connector with power pins on the server system connector, along with the sequential engagement of the fourth communication lane pins, followed by engagement of the third communication lane pins, followed by engagement of the second communication lane pins, and followed by engagement of the first communication lane pins. In conventional systems, the NVMe storage device is configured with a 70 millisecond reset time that starts when the NVMe storage device receives power from the server system connector. However, in some embodiments, prior to all of the sets of communication lane pins engaging, the reset time for the NVMe storage device may expire, causing problems with the operation of the NVMe storage device.
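The race between the reset time and the lane engagement may be sketched as follows. The 70 millisecond reset time is taken from the description; the function names and the example engagement times are hypothetical.

```python
RESET_TIME_MS = 70.0  # reset time stated in the description


def lanes_engaged_before_reset_expiry(power_ms, lane_engage_ms, reset_ms=RESET_TIME_MS):
    """Return the lanes whose pins engage no later than the reset timer expires.

    power_ms: time at which the device first receives power (starts the timer).
    lane_engage_ms: mapping of lane number to its (assumed) engagement time.
    """
    expiry_ms = power_ms + reset_ms
    return sorted(lane for lane, t in lane_engage_ms.items() if t <= expiry_ms)


def all_lanes_ready(power_ms, lane_engage_ms, reset_ms=RESET_TIME_MS):
    """True only if every lane engages before the reset timer expires."""
    expiry_ms = power_ms + reset_ms
    return all(t <= expiry_ms for t in lane_engage_ms.values())


# Hypothetical slow latch closure: lanes 4 and 3 engage in time, lanes 2 and 1 do not.
slow_closure = {4: 10.0, 3: 40.0, 2: 90.0, 1: 120.0}
```

With the hypothetical `slow_closure` timings and power applied at time zero, only the fourth and third communication lanes are engaged when the reset time expires.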
For example, in some situations the closing of the securing latch may cause the NVMe storage device to engage the fourth communication lane pins and the third communication lane pins, followed by engagement of the NVMe storage device with the reset pin after the expiration of the reset time for the NVMe storage device, which results in the reset of the NVMe storage device being driven by the engagement of the reset pin. Because the server system is in a normal operating mode upon that engagement (i.e., the NVMe storage device is being hot-plugged to the server system), the reset pin is being de-asserted prior to the NVMe storage device engaging each of the second communication lane pins, the first communication lane pins, and the reference clock pins. This may cause NVMe storage devices that are configured to perform “lane reversal” to convert the fourth communication lane to a first communication lane (and the third communication lane to a second communication lane in some embodiments), and ignore any other communication lanes that would otherwise be enabled by the subsequent engagement of the first and second communication lane pins. As such, in this example, the NVMe storage device will not operate at its maximum throughput (i.e., the NVMe storage device is capable of communicating via four communication lanes, but only one or two communication lanes are enabled). Furthermore, in NVMe storage devices that are not configured to perform “lane reversal”, the NVMe storage device may not enable any communication lanes if an unexpected communication lane is detected (e.g., the fourth communication lane is detected when the first communication lane is expected), causing the NVMe storage device to be completely unusable (e.g., having no communication lanes available).
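The outcomes described above may be approximated with a simplified model. This sketch is not an implementation of PCIe link training; it assumes that a device counts contiguous lanes starting from the lane it expects to be first, and that a four-lane device may enable four, two, or one lanes. The names `SUPPORTED_WIDTHS` and `enabled_lane_count` are hypothetical.

```python
SUPPORTED_WIDTHS = (4, 2, 1)  # assumed widths for a four-lane device


def _contiguous(detected, lanes):
    """Count how many lanes, taken in the given order, are detected before a gap."""
    count = 0
    for lane in lanes:
        if lane not in detected:
            break
        count += 1
    return count


def enabled_lane_count(detected, supports_lane_reversal):
    """Return the number of lanes enabled in this simplified model.

    detected: set of lane numbers (1..4) whose pins are engaged when the
    reset pin is seen as de-asserted.
    """
    count = _contiguous(detected, (1, 2, 3, 4))
    if count == 0 and supports_lane_reversal:
        # Lane reversal: treat lane 4 as lane 1, lane 3 as lane 2, and so on.
        count = _contiguous(detected, (4, 3, 2, 1))
    for width in SUPPORTED_WIDTHS:
        if count >= width:
            return width
    return 0
```

In this model, a device that detects only the fourth and third communication lanes enables two lanes if it performs lane reversal and no lanes if it does not, matching the reduced-throughput and unusable cases described above.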
In another example of issues resulting from such device hot-plug systems, the sequential engagement of the storage device connector with the server system connector results in the sequential engagement of the NVMe storage device with the reset pin, followed by engagement with the reference clock pins as discussed above. In some situations, the engagement of the NVMe storage device with the reset pin occurs prior to the reference clock (which is connected to the reference clock pins) reaching stability (e.g., 100 μs after engagement of the reference clock pins), and can cause the NVMe storage device to exit reset and enter its operational/functional state (i.e., the NVMe storage device will exit reset when the reset pin is engaged because the server system is not in reset). The provisioning of the de-assertion signal on the reset pin while the reference clock is in the unstable state causes a violation (e.g., a Tperst-clk violation) of the PCIe electromechanical specification, and puts the NVMe storage device into a non-deterministic state due to the pre-reference-clock-stability de-assertion signal preventing its state machines from entering their default state (i.e., the state machines utilize the asynchronous de-assertion of the reset pin following reference clock stability to return to their default states).
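The timing condition described above may be expressed as a simple check. The 100 μs stabilization figure is the example value given in the description; the function name `violates_tperst_clk` and the example times are hypothetical.

```python
REFCLK_STABLE_US = 100.0  # example stabilization time from the description


def violates_tperst_clk(reset_deassert_us, refclk_engage_us, stable_us=REFCLK_STABLE_US):
    """True if the de-asserted reset is seen before the reference clock is stable.

    reset_deassert_us: time at which the device sees the reset pin de-asserted
    (in a hot-plug, the time the reset pin engages).
    refclk_engage_us: time at which the reference clock pins engage.
    """
    refclk_stable_at_us = refclk_engage_us + stable_us
    return reset_deassert_us < refclk_stable_at_us
```

For instance, if the reset pin engages at 500 μs but the reference clock pins do not engage until 900 μs, the check reports a violation, because the reference clock is not stable until 1000 μs.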
Accordingly, it would be desirable to provide an improved device hot-plug system.