A storage system is a processing system adapted to store and retrieve information/data on storage devices (such as disks). The storage system includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the storage devices. Each file may comprise a set of data blocks, whereas each directory may be implemented as a specially-formatted file in which information about other files and directories is stored.
The storage operating system generally refers to the computer-executable code operable on a storage system that manages data access and access requests (read or write requests requiring input/output operations) and may implement file system semantics in implementations involving storage systems. In this sense, the Data ONTAP® storage operating system, available from NetApp, Inc. of Sunnyvale, Calif., which implements a Write Anywhere File Layout (WAFL®) file system, is an example of such a storage operating system implemented as a microkernel within an overall protocol stack and associated storage. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
A storage system's storage is typically implemented as one or more storage volumes that comprise physical storage devices, defining an overall logical arrangement of storage space. Available storage system implementations can serve a large number of discrete volumes. A storage volume is “loaded” in the storage system by copying the logical organization of the volume's files, data, and directories, into the storage system's memory. Once a volume has been loaded in memory, the volume may be “mounted” by one or more users, applications, devices, and the like, that are permitted to access its contents and navigate its namespace.
A storage system may be configured to allow server systems to access its contents, for example, to read or write data to the storage system. A server system may execute an application that “connects” to the storage system over a computer network, such as a shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. The applications executing on the server system may send access requests (read or write requests) to the storage system for accessing particular data stored on the storage system.
A received write request may comprise an inode number, one or more blocks of data to be written (referred to herein as “write blocks”), and a file block number (FBN) for each write block. In general, the file system may represent each file using an inode data structure and assign, for each file and corresponding inode data structure, a unique inode number. Each data block of a file may be uniquely identified within the file by an FBN that indicates the ordering position of the data block within the file. Note, however, that an FBN for a data block does not necessarily indicate the physical storage location of the data block on a storage device. As such, each data block in the file system may be uniquely identified within the file system by the combination of an inode number (indicating the file and inode data structure containing the data block) and an FBN (indicating the ordering position of the data block within the file).
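As a minimal sketch of the request layout described above, the following Python fragment models a write request as an inode number plus a mapping from FBN to block data. The names `WriteRequest`, `write_blocks`, and `block_ids` are hypothetical and chosen only for illustration; they are not taken from any particular file system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class WriteRequest:
    """Hypothetical model of a write request: one file, one or more write blocks."""
    inode: int                     # identifies the file (and its inode data structure)
    write_blocks: Dict[int, bytes]  # FBN -> data of the write block at that position

    def block_ids(self) -> List[Tuple[int, int]]:
        # Each block is identified file-system-wide by the pair (inode number, FBN).
        return [(self.inode, fbn) for fbn in sorted(self.write_blocks)]


# Example: a request for inode 7 carrying the blocks at FBN 10, 11, and 12.
request = WriteRequest(inode=7, write_blocks={11: b"b", 10: b"a", 12: b"c"})
print(request.block_ids())
```

Note that nothing in this structure says where a block will live on a storage device; that is the role of the LBN, which is assigned separately.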
When receiving write requests, the storage system may implement queuing techniques to reduce file fragmentation, and thus reduce read latency. In general, when the storage system receives a write request containing one or more write blocks, the file system may assign, to each write block, a logical block number (LBN) that specifies the storage location where the write block will be stored on a storage device. As such, each received write block may have an associated file identifier (the inode number, indicating the file that contains it) and an FBN (indicating its ordering position within the file), and will be assigned an LBN (indicating its storage location on a storage device).
If the storage system does not perform queuing techniques and assigns LBNs as the write requests are received, significant file fragmentation may result, thus causing significant read latency later. For example, assume a first write request for file A and FBN 10-12 is received, a second write request for file A and FBN 2-9 is then received, a third write request for file A and FBN 13-21 is thereafter received, and write requests for various other files are received between the first, second, and third write requests. In this case, the file system may assign, for example, LBN 119-121 to FBN 10-12, LBN 222-229 to FBN 2-9, and LBN 381-389 to FBN 13-21 for file A. As such, the data blocks of file A are stored on a storage device in an interspersed and non-contiguous manner with significant file fragmentation. Upon receiving a later read request for file A, data blocks of file A are retrieved from the storage device at interspersed and non-contiguous storage locations, thus incurring significant read latency. For example, if the storage device comprises a disk device, the data blocks of file A may be stored on different tracks of the disk device, and thus a later read request for file A will incur significant seek times (the time it takes to move a read/write head of the disk device to the different tracks).
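The fragmentation effect can be illustrated with a small Python sketch that assigns LBNs strictly in arrival order. The function names, the starting LBN of 100, and the interleaved requests for files B and C are assumptions made for this illustration, so the resulting LBN values differ from the numbers in the example above; only the pattern matters.

```python
from itertools import count


def assign_on_arrival(requests, start_lbn=100):
    """Assign the next free LBN to each write block in arrival order.

    `requests` is a list of (file_id, first_fbn, last_fbn) tuples.
    Returns a dict mapping (file_id, fbn) -> lbn.
    """
    next_lbn = count(start_lbn)
    placement = {}
    for file_id, first_fbn, last_fbn in requests:
        for fbn in range(first_fbn, last_fbn + 1):
            placement[(file_id, fbn)] = next(next_lbn)
    return placement


def count_extents(placement, file_id):
    """Count contiguous LBN runs encountered when reading the file in FBN order."""
    lbns = [lbn for (f, _fbn), lbn in sorted(placement.items()) if f == file_id]
    return 1 + sum(1 for a, b in zip(lbns, lbns[1:]) if b != a + 1)


# Three requests for file A, interleaved with requests for other files.
requests = [("A", 10, 12), ("B", 0, 4), ("A", 2, 9), ("C", 0, 2), ("A", 13, 21)]
placement = assign_on_arrival(requests)
print(count_extents(placement, "A"))  # 3 separate runs for file A
```

Reading file A in FBN order here touches three separate LBN runs, one per write request, which on a disk device would translate into additional seeks.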
As such, queuing techniques have been developed to reduce file fragmentation and read latency. In general, when the storage system receives write requests, the file system does not assign LBNs to the write blocks as they are received, but rather queues the write requests and delays assignment of LBNs until a batch of write blocks has been received. For example, the file system may assign LBNs at a consistency point that is triggered at predetermined time intervals or when a predetermined number of write blocks have been received. At the consistency point, for each file having a write request, the file system may produce an ordered list of the FBNs of the received write blocks (e.g., from lowest to highest FBN). In the example above, for file A, the file system may arrange the FBNs of the write blocks in the following order: FBN 2-9, FBN 10-12, and FBN 13-21. The file system may then assign consecutive LBNs to the write blocks of each file according to the ordering of the FBNs of the write blocks produced for each file. In the example given above, the file system may assign, for example, LBN 119-126 to FBN 2-9, LBN 127-129 to FBN 10-12, and LBN 130-138 to FBN 13-21 for file A. As such, the data blocks of file A are stored on a storage device in a more contiguous manner with less file fragmentation. Upon receiving a later read request for file A, data blocks of file A are retrieved from the storage device at more contiguous storage locations, thus incurring less read latency.
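The deferred assignment described above can be sketched in Python as follows. This is a simplified illustration rather than an actual file system implementation: the class name `WriteQueue`, the explicit `consistency_point()` call (standing in for a timer or block-count trigger), and the starting LBN of 119 are all assumptions.

```python
from collections import defaultdict


class WriteQueue:
    """Hypothetical queue that delays LBN assignment until a consistency point."""

    def __init__(self, start_lbn=119):
        self.next_lbn = start_lbn
        self.pending = defaultdict(set)  # file_id -> set of queued FBNs

    def enqueue(self, file_id, first_fbn, last_fbn):
        # Only record which blocks are waiting; no LBN is chosen yet.
        self.pending[file_id].update(range(first_fbn, last_fbn + 1))

    def consistency_point(self):
        """Assign consecutive LBNs per file, in ascending FBN order."""
        placement = {}
        for file_id in sorted(self.pending):
            for fbn in sorted(self.pending[file_id]):
                placement[(file_id, fbn)] = self.next_lbn
                self.next_lbn += 1
        self.pending.clear()
        return placement


queue = WriteQueue()
queue.enqueue("A", 10, 12)  # first request
queue.enqueue("A", 2, 9)    # second request
queue.enqueue("A", 13, 21)  # third request
placement = queue.consistency_point()
print(placement[("A", 2)], placement[("A", 21)])  # 119 138
```

With the three requests for file A queued in arrival order, the 20 blocks FBN 2-21 receive the single consecutive run LBN 119-138, regardless of the order in which the requests arrived.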
A conventional queuing technique uses binary search trees (BSTs), such as a red-black tree, which is a type of self-balancing binary search tree. A BST may comprise a node-based binary tree data structure whose nodes have parent-child relationships. A single BST may represent the FBNs of the write blocks received for a single file, with a single node representing the FBN of a single write block. As write blocks are received for a file, the FBN of each write block is represented by a node that is inserted into the BST for the file. At the consistency point, the nodes of the BST may be traversed to produce a list of the FBNs of the received write blocks, the FBNs being ordered from lowest to highest FBN. The LBNs may then be assigned according to the ordered list of FBNs.
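The per-file tree described above can be sketched in Python as follows. For brevity this sketch uses a plain, unbalanced BST rather than a red-black tree, so it omits the rebalancing and color bookkeeping; the class names and the choice to ignore a repeated FBN are assumptions made for illustration.

```python
class Node:
    """One node per queued write block; the FBN is the node's key value."""
    __slots__ = ("key", "parent", "left", "right")

    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None


class FbnTree:
    """Plain (unbalanced) BST holding the FBNs queued for a single file."""

    def __init__(self):
        self.root = None

    def insert(self, fbn):
        if self.root is None:
            self.root = Node(fbn)
            return
        node = self.root
        while True:
            if fbn == node.key:
                return  # assumption: a repeated FBN is already queued
            side = "left" if fbn < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(fbn, parent=node))
                return
            node = child

    def in_order(self):
        """Yield FBNs from lowest to highest (iterative in-order traversal)."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.key
            node = node.right


tree = FbnTree()
for first, last in [(10, 12), (2, 9), (13, 21)]:  # arrival order for file A
    for fbn in range(first, last + 1):
        tree.insert(fbn)
ordered_fbns = list(tree.in_order())
print(ordered_fbns[0], ordered_fbns[-1])  # 2 21
```

The in-order traversal yields the FBNs in ascending order, which is the ordered list used at the consistency point to assign LBNs.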
Although BSTs (such as red-black trees) provide for fast and efficient search and traverse operations, use of BSTs in queuing techniques also requires extensive memory resources. In a conventional BST, each node may itself comprise a data structure containing a plurality of data fields used for representing a single FBN of a single write block. For example, a BST is typically configured such that each node (representing an FBN) has one parent node, one left child node (representing a lower FBN), and one right child node (representing a higher FBN). When a node represents an FBN, the FBN comprises a “key value” of the node. As such, each node may contain 4 data fields, including fields for: its key value (i.e., the FBN it represents), a pointer to the parent node, a pointer to the left child node, and a pointer to the right child node. If each data field comprises one word (e.g., 32 bits), each node in a BST may require 4 words of data. For example, a BST comprising 32 nodes, for representing 32 FBNs, requires 128 words of memory space. For a BST comprising a red-black tree, each node may also contain an additional color field for indicating the associated color of the node (red or black). As such, each node in a red-black tree may require 5 words of data, whereby 32 nodes, for representing 32 FBNs, require 160 words of memory space.
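The memory arithmetic above can be checked with a short Python helper. The function name `bst_words` is hypothetical, and the sketch assumes, as in the text, that every field (including the color field) occupies one full word.

```python
def bst_words(num_nodes, red_black=False):
    """Words of memory for a BST with one node per FBN.

    Each node holds 4 one-word fields (key, parent, left, right);
    a red-black tree adds a fifth one-word field for the color.
    """
    fields_per_node = 5 if red_black else 4
    return num_nodes * fields_per_node


plain = bst_words(32)                       # 32 nodes * 4 words
colored = bst_words(32, red_black=True)     # 32 nodes * 5 words
print(plain, colored)  # 128 160
```

With 32-bit words, these totals correspond to 512 and 640 bytes respectively, all spent to represent only 32 FBN values.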
When receiving large numbers of write blocks, the amount of memory space needed to process the write blocks using conventional queuing techniques may become very large. As such, there is a need for a method and apparatus for queuing received write blocks for assigning LBNs that is more memory efficient.