Solid State Drives (SSDs) can utilize multiple flash memory chips organized using multiple channels. This architecture can result in data requests being satisfied in a different order than the order in which the requests were made. For example, if an SSD receives two data requests that can be satisfied using a first memory channel and then a third data request that can be satisfied using a second memory channel, the data for the third data request can be returned from the appropriate flash memory chip before the second data request is satisfied.
This architecture introduces parallelism into the system, allowing more than one data request to be satisfied at a time. This parallelism can be beneficial: in general, data can be returned faster in such a parallel architecture than in a serial architecture where each request must be satisfied before the next request can be satisfied. As a result, the average response time of the SSD is improved.
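The out-of-order behavior described above can be illustrated with a minimal sketch. The model is an assumption made for illustration only: all requests are issued at time zero, each channel services its own requests one at a time in issue order, and every flash read takes the same hypothetical `service_time`. The function name `completion_order` and the request labels are likewise hypothetical.

```python
from collections import defaultdict


def completion_order(requests, service_time=1.0):
    """Return request IDs in the order their flash reads complete.

    requests: list of (request_id, channel) tuples in issue order.
    Assumes (for illustration) that every read takes `service_time`
    and that each channel handles one request at a time.
    """
    channel_free_at = defaultdict(float)  # time at which each channel becomes idle
    finished = []
    for issue_index, (request_id, channel) in enumerate(requests):
        done = channel_free_at[channel] + service_time
        channel_free_at[channel] = done
        # issue_index breaks ties so that earlier requests sort first
        finished.append((done, issue_index, request_id))
    finished.sort()
    return [request_id for _, _, request_id in finished]


# Two requests on the first channel, then a third on the second channel.
requests = [("R1", 0), ("R2", 0), ("R3", 1)]
print(completion_order(requests))  # ['R1', 'R3', 'R2']
```

In this sketch the third request completes before the second because the second channel is idle while the first channel is still working through its queue.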
But a bottleneck still exists in the system. All data requests and responses pass through the interface between the SSD and the host computer that issued the data requests, and the host computer needs to process one datum before it can process the next datum. If that interface is busy sending one datum back to the host computer, it cannot be used to send another datum back to the host computer. Continuing the earlier example, if the interface is busy sending the data from the second data request back to the host computer, the interface cannot send the data from the third data request back to the host computer. Put another way, the time required to send data from the SSD back to the host computer includes the latency of the host computer in processing the data it receives from the SSD. The data from the third data request can therefore end up waiting a long time before it is sent back to the host computer. Thus, while parallelism can reduce the average response time of the SSD, parallelism can increase the worst-case response time of the SSD.
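The interface bottleneck can be sketched in the same spirit. The following is an illustrative model only, not a description of any particular SSD: it assumes that all requests are issued at time zero, that the interface sends one datum at a time in the order the data become ready, and that a single hypothetical `transfer_time` covers both the transfer itself and the host computer's processing latency. The ready times used are made-up values.

```python
def interface_schedule(ready, transfer_time):
    """Model a single shared SSD-to-host interface.

    ready: list of (request_id, ready_time), where ready_time is when
           the flash chip has the data available.
    transfer_time: assumed time the interface is occupied per datum,
           including host processing latency.
    Returns {request_id: (wait, response_time)}, where wait is the time
    spent waiting for the interface and response_time is measured from
    time zero.
    """
    interface_free_at = 0.0
    result = {}
    for request_id, ready_time in sorted(ready, key=lambda item: item[1]):
        start = max(ready_time, interface_free_at)  # wait while the interface is busy
        finish = start + transfer_time
        interface_free_at = finish
        result[request_id] = (start - ready_time, finish)
    return result


schedule = interface_schedule(
    [("R1", 1.0), ("R2", 2.0), ("R3", 2.5)], transfer_time=2.0
)
for request_id, (wait, response) in schedule.items():
    print(request_id, "waited", wait, "response time", response)
```

Under these assumed values, the data for the third request is ready at time 2.5 but is not sent until the interface has finished with the first and second requests, so the wait grows with each request queued ahead of it.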
A need remains for a way to improve the worst-case response time of data requests.