Storage devices, particularly Solid State Drives (SSDs), exhibit characteristics that change continuously over time. SSDs may have unpredictable latency and/or bandwidth due to the underlying software (i.e., firmware) and/or hardware inside the SSD. For example, NAND flash memory may have a prolonged read/write latency due to read/write errors. Cell wear may also prolong access operations (read/program/erase), further affecting latency and/or bandwidth. Virtual abstraction of SSD resources (that is, different approaches such as polymorphic SSDs, open-channel SSDs, and LightNVM, a subsystem that supports open-channel SSDs, to name a few) makes it hard to predict an SSD's performance characteristics. Finally, different cell densities, such as Single Level Cell (SLC), Multi-Level Cell (MLC), Triple Level Cell (TLC), and Quadruple Level Cell (QLC), to name a few, may have different characteristics.
As such, dynamic latency and bandwidth monitoring/profiling are useful in datacenters to reduce unpredicted latency, which may contribute to long-tail latency. Achieving this is very challenging because the measurements are often complicated. For example, approximating a fitting curve from randomly selected measurement points not only requires many measurements, but also makes it very hard to ensure any guaranteed degree of performance.
That said, the device has the best knowledge of itself. That is, the device's architectural construction supplies many hints about what may contribute to a saturated bandwidth. For example, the number of NAND channels, the number of controllers, the command queue depths, and the number of queues may serve as hints for estimating the number of requests or the duration of measurement needed to acquire reliable performance data. But devices outside the SSD do not have meaningful access to this information.
A need remains for a way for an SSD to provide profiling information to devices outside the SSD.