Accessing or writing data in memory involves transferring a block of data from one device (for example, a hard drive) to another (for example, a RAM cache). The block of memory that resides in these devices may be further subdivided into smaller chunks that are not necessarily contiguous. For example, a 4 MB block may be stored as four separate 1 MB chunks located anywhere in the memory space of the device. Therefore, information about the physical locations of the chunks is needed so that a memory controller (for example, a Direct Memory Access (DMA) Master or the DMA controller) can either collect the data from these separate chunks (Gather) or write data into these separate chunks (Scatter). This is where Scatter/Gather elements are utilized.
The Scatter/Gather element (SG element) contains the physical location of one memory chunk (also called a fragment) along with the size of the data contained in that chunk. A number of SG elements together can describe the locations and sizes of the chunks of memory that make up the block of data to be transferred.
The format of an SG element can differ depending upon the application. For the purpose of uniformity, the IEEE 1212.1 compliant SG element, which is illustrated in FIG. 1, will be described by way of example only. As shown in FIG. 1, a typical SG element has the following fields: a 64-bit Address field 100 that points to the starting location of the fragment in memory; a 32-bit Length field 102 that indicates the amount of data contained in or transferable to that particular fragment; a 31-bit Reserved field 104 that is set to zeroes; and a 1-bit Extension (Ext) field 106 that indicates whether this element is a pointer to the next SG element or a pointer to a data buffer. The Extension field 106 is needed because the SG elements themselves may not be stored contiguously in memory. In that case, the Address field 100 of an SG element can be used to point to the location of the next SG element in the list. For such an SG element, the Length field 102 is ignored and the Ext bit 106 is set. An SG element pointing to a data buffer may also have the Length field 102 set to all zeroes, which can mean either that the DMA controller should ignore the contents of this element and move on to the next element in the list, or that the block is empty.
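The field layout described above can be sketched as a 16-byte record. The following is a minimal illustration only: the byte order (little-endian), the placement of the Ext bit in the most significant bit of the last 32-bit word, and the names `SGElement`, `pack_sg` and `unpack_sg` are assumptions made for this example and are not taken from FIG. 1 or from IEEE 1212.1.

```python
import struct
from typing import NamedTuple

# ASSUMPTION: little-endian encoding; 64-bit Address, 32-bit Length, then one
# 32-bit word whose top bit is Ext and whose low 31 bits are Reserved (zero).
SG_FORMAT = "<QII"
SG_SIZE = struct.calcsize(SG_FORMAT)  # 8 + 4 + 4 bytes
EXT_BIT = 1 << 31


class SGElement(NamedTuple):
    address: int  # start of a fragment, or of the next SG element if ext is set
    length: int   # bytes in the fragment; ignored when ext is set
    ext: bool     # True: address points to the next SG element


def pack_sg(elem: SGElement) -> bytes:
    """Encode one SG element into its 16-byte form."""
    flags = EXT_BIT if elem.ext else 0
    return struct.pack(SG_FORMAT, elem.address, elem.length, flags)


def unpack_sg(raw: bytes) -> SGElement:
    """Decode a 16-byte SG element, rejecting non-zero Reserved bits."""
    address, length, flags = struct.unpack(SG_FORMAT, raw)
    if flags & ~EXT_BIT:
        raise ValueError("Reserved field must be all zeroes")
    return SGElement(address, length, bool(flags & EXT_BIT))
```

Under these assumptions, a data element and an extension element differ only in the Ext bit; a consumer decides how to interpret the Address field after decoding.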
FIG. 2 shows how an SG List (also called an SGL, a chained list of SG elements) can be used to completely specify a block of memory in a device. A typical SGL may have only a single SG element or may have a large number of SG elements. An SGL may be segmented, in which case it contains one or more SGL segments. Typically, segments are created using a special SG element called an extension element or a segment descriptor. A typical SG element may also include segment information, either directly or indirectly, if the list is segmented.
As shown in FIG. 2, Fragments 0 through 4 are located at non-contiguous and random locations in physical memory 108 (which may reside in different memory spaces). The SGL 110, however, puts all of these together by having SG elements 112 that point to the starting location of each fragment. As we traverse the list, we appear to have a contiguous logical memory block whose total size is the combined size of all of the fragments. Such a logical memory block 114 is shown in FIG. 2 for illustrative purposes, though it is understood not to exist physically.
Notice in the example of FIG. 2 that the SGL 110 itself is not contiguously located in physical memory. The fifth SG element of the first set of SG elements points to the next SG element in the list by using the extension capability of the SGL. Also notice that we cannot traverse the list backwards—for example, we cannot go back to the fifth SG element once we traverse on to the sixth one, as we have no information in the sixth SG element that points back to the address of the fifth SG element.
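The forward-only traversal described above can be sketched as follows. This is a simplified model under stated assumptions: physical memory is a Python `bytearray`, each group of SG elements is stored in a dictionary keyed by a hypothetical segment address, and the names `walk_sgl` and `gather` are invented for this example. The sketch does not guard against a malformed list that chains back onto itself.

```python
from typing import Dict, Iterator, List, NamedTuple


class SGElement(NamedTuple):
    address: int
    length: int
    ext: bool = False


def walk_sgl(segments: Dict[int, List[SGElement]], head: int) -> Iterator[SGElement]:
    """Yield data-bearing SG elements in list order, following Ext links."""
    seg_addr = head
    while seg_addr is not None:
        next_seg = None
        for elem in segments[seg_addr]:
            if elem.ext:
                # Extension element: its address locates the next group.
                next_seg = elem.address
                break
            if elem.length == 0:
                continue  # zero-length data element: skip it
            yield elem
        seg_addr = next_seg  # None when no extension element was found


def gather(memory: bytearray, segments: Dict[int, List[SGElement]], head: int) -> bytes:
    """Collect all fragments into one contiguous logical block."""
    out = bytearray()
    for elem in walk_sgl(segments, head):
        out += memory[elem.address:elem.address + elem.length]
    return bytes(out)
```

Because each element carries only a forward pointer, the generator can move in one direction only; reaching an earlier element again means calling `walk_sgl` from the head once more.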
SGLs are commonly used for handling data transfers to non-contiguous memory buffers. A typical Peripheral Component Interconnect Solid State Device (PCI SSD) stripes read requests over multiple flash Logical Units (LUNs), which causes input/output (IO) data to come back from the storage side flash drives in an out-of-order fashion. A typical IO transfer involves a command phase and a data phase. During the command phase, the IO process is set up by fetching or creating all the requisite control structures. The actual data transfer is performed during the data phase. Usually, the SGL is a bottleneck in IO transfer. Typically, this bottleneck is resolved by caching the SGL locally. SGL caches are like any other cache structure. Each cache line holds a few SG elements of an SGL belonging to a certain data context. The SGL cache can implement any existing allocation scheme and cache line replacement policy. In one example, each cache line is mapped to an IO and stores several SG elements belonging to that particular IO.
FIG. 3 shows a simplified view of an SGL cache. A typical SGL cache contains a cache memory 300; a TAG memory and TAG lookup logic 302; access logic 304 that handles accesses to the SGL cache; and read logic 306 that handles all incoming SGL reads from the outside world (host or main memory, not shown). The cached SG elements are stored in the cache memory 300. When SG elements are requested from the cache, a TAG lookup is performed first. If the required SG element is found in the cache memory (that is, the lookup results in a “hit”), then the SG element is provided to the requesting agent. The lookup and fetching of the SG element are handled by the access logic 304 shown in FIG. 3. Otherwise, the required SG element is fetched from the host or main memory where it is stored, and is then written into the cache memory. The read operation from the host or main memory and the write operation to the cache memory are handled by the read logic 306 in FIG. 3.
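The hit and miss paths described above can be illustrated with a small software model. This is a sketch only, not the structure of FIG. 3: the class name `SGLCache`, the tag format (IO identifier plus line index), the least-recently-used replacement policy, and the `fetch_line` callback standing in for the read logic are all assumptions chosen for the example. It also assumes that the host can return SG elements by index, which sidesteps the chained traversal a real list would require.

```python
from collections import OrderedDict


class SGLCache:
    """Toy SGL cache: each line holds a few SG elements of one IO."""

    def __init__(self, fetch_line, num_lines=4, elems_per_line=4):
        self.fetch_line = fetch_line      # stand-in for reads from host memory
        self.num_lines = num_lines
        self.elems_per_line = elems_per_line
        self.lines = OrderedDict()        # tag -> list of SG elements
        self.hits = 0
        self.misses = 0

    def get(self, io_id, index):
        """Return SG element number `index` of the SGL belonging to `io_id`."""
        tag = (io_id, index // self.elems_per_line)
        if tag in self.lines:
            self.hits += 1
            self.lines.move_to_end(tag)   # mark line as most recently used
        else:
            self.misses += 1
            if len(self.lines) >= self.num_lines:
                self.lines.popitem(last=False)  # evict least recently used line
            first = tag[1] * self.elems_per_line
            self.lines[tag] = self.fetch_line(io_id, first, self.elems_per_line)
        return self.lines[tag][index % self.elems_per_line]
```

A caller supplies `fetch_line(io_id, first, count)` returning the requested elements; the `hits` and `misses` counters then show how a given access pattern behaves against the chosen line count and line size.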
Typically, when Flash drives (also called “storage side memory” herein) are accessed, striping techniques are used to achieve higher performance. Since the access latency of each Flash drive may be different, the order in which data is read depends on the Flash drive characteristics. Consequently, the IO read operation becomes an “out-of-order” transfer. SGL caching becomes inefficient in out-of-order IO transfers, as SG elements have to be traversed back and forth in the list. Moreover, every backward move in the list requires restarting from the start of the list, as SG elements do not contain information regarding the preceding element. An out-of-order transfer thus makes the cache traverse up and down the SGL, discarding existing cache contents, fetching new elements, and later re-fetching older elements. This phenomenon is called thrashing and causes heavy performance degradation.
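The cost of thrashing can be made concrete with a toy model. The sketch below assumes a cache that holds a single window of consecutive SG elements, that can only walk the list forward one window at a time, and that must restart from the head of the list for any backward access. The function name `count_fetches` and the window size are hypothetical, and the numbers only illustrate the trend; they are not measurements of any real device.

```python
def count_fetches(order, window=4):
    """Count SG elements read from the host for a given access order."""
    start, end = 0, 0  # cached window is [start, end); empty at first
    fetched = 0
    for i in order:
        if start <= i < end:
            continue  # hit: element is already cached
        # Backward access restarts at the head; forward access continues
        # from where the cached window stops.
        pos = 0 if i < start else end
        # Walk forward one window at a time until the window covers i.
        while not (pos <= i < pos + window):
            fetched += window
            pos += window
        fetched += window
        start, end = pos, pos + window
    return fetched
```

In this model, visiting elements 0, 1, 8, 9 in order costs 12 element fetches, while the interleaved order 0, 8, 1, 9 costs 24, because each backward step discards the window and re-reads the list from its head.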
It is, therefore, desirable to provide an improved method and apparatus for handling SGLs in out-of-order systems.