A storage array or disk array is a data storage device that includes multiple hard disk drives (HDDs) or similar persistent storage units. A storage array can allow large amounts of data to be stored in an efficient manner. A server or workstation may be directly attached to the storage array such that the storage array is local to the server or workstation. In cases in which the server or workstation is directly attached to the storage array, the storage array is typically referred to as a direct-attached storage (DAS) system. Alternatively, a server or workstation may be remotely attached to the storage array via a storage area network (SAN). In SAN systems, although the storage array is not local to the server or workstation, the disk drives of the array appear to the operating system (OS) of the server or workstation to be locally attached.
FIG. 1 illustrates a block diagram of a typical data storage system 2 that implements a command-pull model. The system 2 includes a host system 3, a memory controller 4, and a peripheral component interconnect (PCI) or PCI Express (PCIe) bus 5. The controller 4 includes a central processing unit (CPU) 6, a memory device 7, and an I/O interface device 8. The I/O interface device 8 is configured to perform data transfer in compliance with known data transfer protocol standards, such as the Serial Attached SCSI (SAS) standard, the Serial Advanced Technology Attachment (SATA) standard, or the Non-Volatile Memory Express (NVMe) standard. The I/O interface device 8 controls the transfer of data to and from multiple physical disks (PDs) 9. The memory controller 4 communicates via the bus 5 with a system CPU 11 and a system memory device 12. The system memory device 12 stores data and software programs for execution by the system CPU 11. A portion of the system memory device 12 is used as a command queue 13.
During a typical write action, the system CPU 11 runs a memory driver software stack 14 that stores commands and data in the command queue 13. When the memory driver 14 stores a command in the command queue 13, it notifies the memory controller 4 that a command is ready to be executed. When the controller CPU 6 is ready to execute a command, it pulls the command, or multiple commands, and the associated data from the command queue 13 via the bus 5 and issues a completion interrupt to the host system 3. When the commands are executed by the memory controller 4, the controller CPU 6 causes the data associated with the commands to be temporarily stored in the controller memory device 7 and then subsequently written to one or more of the PDs 9 via the I/O interface device 8.
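The write flow described above can be sketched as a small in-memory model. This is only an illustration of the command-pull ordering (store, notify, pull, stage, write); the class names, the `service` method, and the `max_batch` parameter are hypothetical and do not correspond to any real driver or controller interface.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Command:
    op: str         # e.g. "write"
    pd_index: int   # index of the target physical disk (PD)
    data: bytes


class Host:
    """Stands in for the host system: owns the command queue in system memory."""

    def __init__(self):
        self.command_queue = deque()
        self.completions = []

    def submit(self, controller, cmd):
        # The driver stores the command in the queue, then notifies the controller.
        self.command_queue.append(cmd)
        controller.notify()

    def on_completion_interrupt(self, cmd):
        self.completions.append(cmd)


class Controller:
    """Stands in for the controller: pulls commands when it is ready."""

    def __init__(self, host, num_pds=2):
        self.host = host
        self.pending = 0                          # notifications not yet serviced
        self.staging = []                         # models the controller memory device
        self.pds = [[] for _ in range(num_pds)]   # models the physical disks

    def notify(self):
        # A notification only records that work exists; nothing is pulled yet.
        self.pending += 1

    def service(self, max_batch=4):
        # Pull one or more commands from the host queue at the controller's own pace.
        pulled = []
        while self.pending and len(pulled) < max_batch:
            cmd = self.host.command_queue.popleft()
            self.pending -= 1
            pulled.append(cmd)
            self.host.on_completion_interrupt(cmd)
        # Stage the data in controller memory, then write it out to the PDs.
        self.staging.extend(pulled)
        while self.staging:
            cmd = self.staging.pop(0)
            self.pds[cmd.pd_index].append(cmd.data)
        return len(pulled)
```

In this sketch, commands submitted by the host simply sit in `command_queue` until the controller calls `service()`, which is the decoupling between host and controller that the pull model provides.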
Historically, the performance of an HDD-based system of the type shown in FIG. 1 has been measured in terms of input/output operations per second (IOPS), and in some cases, megabytes per second (MB/s). Latency of such a storage system is typically given as: Latency_Overall=Latency_SW_Stack+Latency_Controller+Latency_HDD=1/IOPS, where Latency_Overall is the overall latency of the system, Latency_SW_Stack is the latency associated with the memory driver 14, Latency_Controller is the latency associated with the memory controller 4, and Latency_HDD is the latency associated with the PDs 9. Latency_SW_Stack is typically on the order of microseconds (10⁻⁶ seconds). Likewise, Latency_Controller is typically on the order of tens of microseconds. However, Latency_HDD is typically on the order of milliseconds (10⁻³ seconds) or tens of milliseconds. Approximately 99% of overall latency is due to extremely slow mechanical parts of the HDDs. Therefore, for practical purposes, Latency_SW_Stack and Latency_Controller can be ignored when determining system performance. In other words, the overall latency can be estimated as being equal to Latency_HDD.
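As a worked example of the relation above, the following sketch plugs in one value per term. The three numbers are assumptions chosen only to match the stated orders of magnitude (microseconds, tens of microseconds, and about ten milliseconds); they are not measurements, and the function names are hypothetical. The 1/IOPS conversion is taken at face value from the formula.

```python
def overall_latency_us(sw_stack_us, controller_us, device_us):
    """Latency_Overall = Latency_SW_Stack + Latency_Controller + Latency_device."""
    return sw_stack_us + controller_us + device_us


def iops_from_latency(latency_us):
    """Invert Latency_Overall = 1/IOPS, with latency given in microseconds."""
    return 1_000_000 / latency_us


# Assumed illustrative values, in microseconds.
SW_STACK_US = 5.0
CONTROLLER_US = 50.0
HDD_US = 10_000.0

total_us = overall_latency_us(SW_STACK_US, CONTROLLER_US, HDD_US)
hdd_share = HDD_US / total_us

print(f"overall latency: {total_us:.0f} us")
print(f"IOPS: {iops_from_latency(total_us):.1f}")
print(f"share of latency due to the HDD: {hdd_share:.1%}")
```

With these assumed values the HDD term accounts for more than 99% of the total, which is why dropping the other two terms changes the estimate very little.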
Recently, there has been a transition from using magnetic HDDs as the PDs 9 to using solid state drives (SSDs), or a combination of SSDs and HDDs, as the PDs 9. In the industry, the use of SSD-based solutions is viewed as an evolution of HDD-based solutions. However, SSD-based solutions are approximately one hundred times faster and consume much less power than HDD-based solutions. This view of SSD-based solutions has led the industry to continue using the pre-existing, above-described pull methodology in SSD-based solutions to pull commands from the command queue into the memory controller. Also, because SSD-based solutions have been viewed in the industry as merely an evolution of HDD-based solutions, IOPS have been used as the performance metric for measuring system performance in storage systems that implement SSD-based solutions.
However, the differences between SSD-based solutions and HDD-based solutions are much greater than they appear, and traditional metrics should not be used to measure the performance of systems that implement SSD-based solutions. In a system that implements an SSD-based solution, the overall latency of the storage system is given as: Latency_Overall=Latency_SW_Stack+Latency_Controller+Latency_SSD=1/IOPS, where Latency_SW_Stack is the latency associated with the memory driver 14, Latency_Controller is the latency associated with the memory controller 4, and Latency_SSD is the latency associated with the SSDs that are used as the PDs 9. Unlike the latency of the HDDs, the latency of the SSDs is on the order of tens to hundreds of microseconds, e.g., generally in the range of 100 to 300 microseconds, and the latencies associated with the memory driver 14 and the memory controller 4 add much more than that to the overall latency. Therefore, in calculating the overall latency of the storage system that implements an SSD-based solution, Latency_SW_Stack and Latency_Controller should no longer be ignored.
The command-pull approach requires quite a bit of interaction between the memory driver 14 and the memory controller 4. This is a convenient approach in HDD-based systems in that it allows the fast operating system (OS) side of the host system 3 to be almost completely independent of the slower HDD-based controller side so that the OS side can pile up as many commands as possible in the queue 13 to provide greater queue depth (QD), which is very desirable and common in HDD-based solutions. The memory controller 4 can then pull the commands from the queue 13 at its own pace. While this method is very convenient in HDD-based solutions, it adds a large amount of extra latency due to the synchronization that is required between the memory controller 4 and the host system 3, and due to the fact that the memory controller 4 may pick up commands at times much later than when they were issued. In addition, if there is a lot of work to be done by the memory controller 4, as is often the case, all of the command processing must compete with the rest of the workload of the memory controller 4, which adds more latency to the overall latency.
The above-described command-pull model works very well for HDD-based solutions where adding 50 to 500 microseconds to a command that typically may take about 10,000 microseconds to complete is negligible given the other advantages that the method provides. However, the command-pull model does not produce acceptable results when used in a storage system that implements an SSD-based solution where the access time may be as low as 100 microseconds, or, in cases in which a dynamic random access memory (DRAM) write back (WB) buffer is used in the memory controller 4, as low as 1 to 5 microseconds.
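The comparison above can be made concrete by expressing the added 50 to 500 microseconds as a fraction of each access time mentioned. The sketch below uses only the figures quoted in this paragraph (taking 5 microseconds as the upper end of the DRAM write-back case); the function name and dictionary labels are hypothetical.

```python
def added_latency_ratio(added_us, access_us):
    """Added command-pull latency expressed as a fraction of the device access time."""
    return added_us / access_us


ADDED_RANGE_US = (50.0, 500.0)   # added latency range quoted in the text
ACCESS_TIMES_US = {
    "HDD": 10_000.0,             # about 10,000 microseconds per command
    "SSD": 100.0,                # as low as 100 microseconds
    "DRAM WB buffer": 5.0,       # as low as 1 to 5 microseconds; 5 used here
}

for name, access_us in ACCESS_TIMES_US.items():
    low, high = (added_latency_ratio(a, access_us) for a in ADDED_RANGE_US)
    print(f"{name}: added latency is {low:.1%} to {high:.1%} of access time")
```

Under these figures the added latency is at most a few percent of an HDD access, but it ranges from half of an SSD access time to several multiples of it, and is far larger still relative to a DRAM write-back buffer.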
Nevertheless, as indicated above, the overall latency of storage systems that implement SSD-based solutions is still being estimated as being equal to Latency_SSD, while leaving Latency_SW_Stack and Latency_Controller out of the estimation. For this reason, attempts at reducing the overall latency of storage systems that implement SSD-based solutions have focused primarily on reducing Latency_SSD rather than on reducing Latency_SW_Stack or Latency_Controller.
A need exists for a storage system that implements an SSD-based solution and that significantly reduces overall latency by significantly reducing one or both of Latency_SW_Stack and Latency_Controller.