The present invention generally relates to mass storage devices. More particularly, this invention relates to methods and Serial Advanced Technology Attachment (SATA) devices suitable for improving the efficiency of communication over a SATA bus.
SATA devices, such as hard disk drives (HDDs) and nonvolatile solid-state drives (SSDs), communicate with a host using the SATA protocol. To implement a large read command, the SATA device typically transmits the requested data to the host in multiple blocks (e.g., 8 KB blocks).
FIG. 1 represents a situation that occurs when both the host and the SATA device are attempting to communicate with each other at the same time. In 100, both send the other X_RDY primitives to initiate a command or data transfer, respectively. The SATA protocol states that the host must back down in the event of a collision between X_RDY primitives (i.e., when both links wish to transmit an FIS (Frame Information Structure) simultaneously). Typically for SATA NCQ (Native Command Queuing) protocol, this is a collision between the host wanting to send a new command and the device wanting to send a DMA Setup FIS to start the data phase of a previously received command.
In 120, the host must back down when it detects that it has received an X_RDY primitive from the SATA device and therefore aborts a command that was about to be sent. In 130, the host completes the abort and responds to the incoming X_RDY primitive with an R_RDY primitive. The SATA device may then proceed to complete the data transfer.
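The back-down rule described above can be illustrated with a minimal sketch. The names `Side` and `resolve_x_rdy` are hypothetical and chosen only for illustration; the sketch models only the arbitration outcome, not the SATA link-layer state machines themselves.

```python
from enum import Enum


class Side(Enum):
    HOST = "host"
    DEVICE = "device"


def resolve_x_rdy(host_sends_x_rdy: bool, device_sends_x_rdy: bool):
    """Return the side that may transmit its FIS, or None if neither asked."""
    if host_sends_x_rdy and device_sends_x_rdy:
        # Collision: the host backs down, aborts its pending command,
        # and answers the device's X_RDY with R_RDY.
        return Side.DEVICE
    if host_sends_x_rdy:
        return Side.HOST
    if device_sends_x_rdy:
        return Side.DEVICE
    return None


# A device that always has data ready wins every collision.
print(resolve_x_rdy(True, True))
```

Under this rule, a host that only ever requests the bus at the same time as the device never gets to send its command, which is the starvation effect discussed next.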
As operating speeds of SATA devices have improved, it has been determined that, if the SATA device is too fast, it tends to win the above collision repeatedly, thereby starving itself of commands from its host. In the case of read commands, this results in a period during which the SATA bus is underutilized: once all the data phases are complete, no commands are being processed in the SATA device, so no data can be transferred for at least a period of time equal to the read access latency of the SATA device.
To explain this situation in more detail, FIG. 2 represents how the bus activity proceeds with time where the host is continuously sending a stream of commands to read data. The host begins to send the commands at 200 and continues until it has completed a burst of commands or has reached the SATA NCQ limit (32 in the present version) at 210. The SATA device takes time to respond to the first command with a read latency 220. This time period will generally be much longer than the time taken to issue all the commands needed to fill the command queue of the SATA device. Therefore, this process gives rise to a time period 240 in which the bus is inactive: the host is waiting for the SATA device to respond with data, and the SATA device has yet to fetch the data for the first command received.
At 230, the SATA device begins to return data to the host. The SATA device will continue to return data for as long as it has fetched data for commands that are outstanding. As the host begins to receive data for outstanding commands (data is acknowledged using a tag which identifies the command that was used to read the data), the host is aware that the SATA device's queue is no longer full and has spare slots in its queue. However, while the SATA device is transferring data, the host is unable to send commands to the SATA device due to the SATA protocol collisions discussed in FIG. 1. For example, at 250, the host attempts to send more commands but is unable to do so and has to abort each attempt. At 260, the SATA device ceases to return data as all data for the read commands in the queue has been fetched. By 260, the SATA device has no more outstanding commands in its queue and has finished transferring data. The host is now no longer forced to back down on its attempts to send commands and proceeds to send commands to re-fill the command queue of the SATA device. The SATA device will again suffer a latency before the data for the first of this batch of new commands is available for transfer back to the host.
In this way, the continuous sending of commands to the SATA device and transfer of data back to the host proceeds in a repeated cycle identical to the first cycle 280. This is inefficient, as there is a period of bus inactivity in each cycle. By way of example, for a SATA bus operating at 6 Gbps, a command may be sent to the SATA device in about 2 μs (microseconds). The SATA device then takes about 1 μs to set up a DMA transfer, with the transfer itself (for 4 KB of data) taking about 7 μs. Each 4 KB transfer therefore takes a total of about 10 μs, making it theoretically possible to support 100 K transfers per second (referred to in disk drive specifications as Input/Output Operations Per Second, or IOPS). This assumes that the commands and transfers all occur back to back, with no gaps in between that would leave the bus inactive.
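The arithmetic behind the 100 K IOPS figure can be reproduced as follows. The constant names are illustrative only, and the timings are the approximate values given above.

```python
# Approximate per-operation timings for a 6 Gbps SATA bus, in microseconds.
COMMAND_US = 2.0      # send one read command to the device
DMA_SETUP_US = 1.0    # device sets up the DMA transfer
DATA_4KB_US = 7.0     # transfer 4 KB of data to the host

per_transfer_us = COMMAND_US + DMA_SETUP_US + DATA_4KB_US
max_iops = 1_000_000 / per_transfer_us  # microseconds per second / time per transfer

print(per_transfer_us)  # 10.0
print(max_iops)         # 100000.0
```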
However, with the scenario represented in FIG. 2, the time taken for the host to send 32 commands is about 64 μs. With hard disk drives, the read command latency is measured in milliseconds, whereas with a solid state drive this latency may be as low as 70 μs. In solid state drives, this still gives rise to a bus inactivity period of about 6 μs (i.e., 70−64=6). This period of inactivity is repeated on each cycle, even where the host is continually sending commands and can do so faster than the SATA device can send data back. An ideal back-to-back command and data transfer cycle for 32 commands would take about 320 μs, but the SATA protocol behavior introduces this bus inactivity period which, even if it is as low as 6 μs for an SSD, can still represent a performance degradation of a few percent.
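The figures in this example can be checked with a short calculation. Expressing the degradation as the idle time divided by the total cycle time is an assumption made here for illustration; the constant names are likewise illustrative.

```python
QUEUE_DEPTH = 32            # SATA NCQ limit
COMMAND_US = 2              # time to send one command
PER_TRANSFER_US = 10        # command + DMA setup + 4 KB data, back to back
SSD_READ_LATENCY_US = 70    # example SSD read latency

issue_time_us = QUEUE_DEPTH * COMMAND_US                 # time to fill the queue
idle_us = max(0, SSD_READ_LATENCY_US - issue_time_us)    # bus inactivity per cycle
ideal_cycle_us = QUEUE_DEPTH * PER_TRANSFER_US           # cycle with no gaps
actual_cycle_us = ideal_cycle_us + idle_us               # cycle including the gap

# Assumed metric: share of each cycle during which the bus is idle.
idle_fraction = idle_us / actual_cycle_us

print(issue_time_us, idle_us, ideal_cycle_us, actual_cycle_us)
print(f"{idle_fraction:.2%}")
```

With a hard disk drive latency measured in milliseconds, the same calculation yields an idle period far larger than the 320 μs ideal cycle itself.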
Prior attempts to address this issue include adding a relatively small programmable delay between each block transfer to allow the host to send intra-command data to the SATA device. However, inserting a delay between each block transfer limits the maximum IOPS of the device and can significantly degrade the performance (throughput) of the SATA device.
Alternatively, U.S. Pat. No. 7,827,320 to Stevens discloses a method whereby, when a SATA device is in the X_RDY (XRDY) state and receives an X_RDY primitive from the host, the SATA device sets an RX_RDY (RXRDY) tag to identify the collision. Then, after the SATA device sends the next FIS to the host (in response to the R_RDY (RRDY) primitive), the SATA device enters the IDLE state and checks whether the RX_RDY tag was set. If so, the SATA device enters a secondary idle state, clears the RX_RDY tag, and waits for the host to transmit another X_RDY primitive. If the SATA device receives an X_RDY primitive while in the secondary idle state, the SATA device transitions into a receive FIS state and receives and processes an FIS from the host containing intra-command data (e.g., a free fall event detected). If the host does not transmit an X_RDY primitive to the SATA device, the SATA device transitions into the X_RDY state in order to continue processing the current read command.
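The state transitions summarized above may be sketched as a small table-free state machine. This is a simplified reading of the description, not code from Stevens: the state names `SEND_FIS` and the event strings (`"host_x_rdy"`, `"host_r_rdy"`, `"fis_sent"`, `"enter_idle"`, `"host_silent"`) are hypothetical labels introduced for illustration, and any state and event combination not described above leaves the state unchanged.

```python
from enum import Enum, auto


class State(Enum):
    X_RDY = auto()
    SEND_FIS = auto()        # hypothetical name for "sending the next FIS"
    IDLE = auto()
    SECONDARY_IDLE = auto()
    RECEIVE_FIS = auto()


def step(state, rx_rdy_tag, event):
    """Apply one event; return the new (state, rx_rdy_tag) pair."""
    if state is State.X_RDY and event == "host_x_rdy":
        return State.X_RDY, True              # collision noted in the tag
    if state is State.X_RDY and event == "host_r_rdy":
        return State.SEND_FIS, rx_rdy_tag     # host backed down; send the FIS
    if state is State.SEND_FIS and event == "fis_sent":
        return State.IDLE, rx_rdy_tag
    if state is State.IDLE and event == "enter_idle" and rx_rdy_tag:
        return State.SECONDARY_IDLE, False    # clear the tag, wait for the host
    if state is State.SECONDARY_IDLE and event == "host_x_rdy":
        return State.RECEIVE_FIS, rx_rdy_tag  # accept the host's FIS
    if state is State.SECONDARY_IDLE and event == "host_silent":
        return State.X_RDY, rx_rdy_tag        # resume the current read command
    return state, rx_rdy_tag                  # combination not modeled here


def run(events, state=State.X_RDY, rx_rdy_tag=False):
    """Feed a sequence of events through step() and return the final pair."""
    for event in events:
        state, rx_rdy_tag = step(state, rx_rdy_tag, event)
    return state, rx_rdy_tag


collision = ["host_x_rdy", "host_r_rdy", "fis_sent", "enter_idle"]
print(run(collision + ["host_x_rdy"]))   # host retries while the device waits
print(run(collision + ["host_silent"]))  # host does not retry
```

The wait in the secondary idle state is what gives the host a window to retry, and it is also the source of the per-command bus inactivity discussed with reference to FIG. 3.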
As best understood, FIG. 3 represents a bus timing diagram in accordance with an embodiment as disclosed in Stevens. As represented in FIG. 3, when the host attempts to send a command while the device is transferring data 250, the device notes this fact, completes the current FIS transfer, and then waits 255. This leads to a period of bus inactivity 257. When the host completes its back down and aborts the command, it retries at 258 and may be successful (assuming the device has waited long enough). A command is then sent and then the device resumes its data transfers 235. When the host tries to send the next command 275, the process repeats with another period of bus inactivity 276, the command sent 277, and data transfers resuming 278. This cycle will repeat again as represented at 285, 286, 287, 288, and for as long as the host has commands to send.
Therefore, systems of this type still suffer an initial bus inactivity 240, but then proceed to a shorter cycle where, for each further command sent, there is a short period of inactivity, 257, 276, 286, etc., while the device waits for the host to complete its back down and command abort before re-trying the command. This introduces a bus delay for every command that is sent. While the delay per command may be less than the delay per command for the prior art represented in FIG. 2 (delay 240 divided by the maximum number of commands stored in the SATA device, generally 32), the bus utilization is still less than optimal and the IOPS performance figure is reduced.
In view of the above, it can be appreciated that there is an ongoing desire for improved methods and devices capable of providing efficient command and data transfers in the SATA protocol.