1. Technical Field
The present invention relates generally to a method of managing data in an asymmetric cluster file system and, more particularly, to a method of managing data in an asymmetric cluster file system in which a data distribution pattern is fixed. In order to prepare for the failure of a data server, the method configures and reconfigures device and file layouts in situations such as the situation after copying of data and the situation of addition of a data server, thereby offering a solution to the failures of the data servers.
2. Description of the Related Art
An asymmetric cluster file system is a system that separately manages the actual data of each file and the metadata thereof, that is, the attribute information of the file. In this case, the actual data of the file is distributed and stored among a plurality of data servers and the metadata thereof is managed by a metadata server. For this purpose, as shown in FIG. 1, an asymmetric cluster file system is configured in a distributed structure such that a metadata server 10 and a plurality of data servers 20 are connected over a network 30.
A client 40 first accesses the metadata of an actual data file stored in the metadata server 10 in order to access actual data. The client 40 collects information about the plurality of data servers 20 in which the actual data has been stored, via the metadata. The client 40 accesses the plurality of data servers 20 in which the actual data has been stored using the collected information, and then performs the task of inputting or outputting the actual data.
Meanwhile, if a failure occurs in a data server 20 of the asymmetric cluster file system or in the network 30, the client 40 cannot perform the tasks of inputting and outputting the desired actual data. In order to overcome this problem, copies of the actual data stored in the data server 20 are stored in other data servers 20. In this case, it is common to distribute, store and manage two or more copies among different data servers 20 while taking the data storage cost into consideration. This also provides the advantage that, because the asymmetric cluster file system maintains copies across multiple data servers 20, the access load imposed by the client 40 is distributed. As an example, Korean Patent Application Publication No. 10-2011-0070659 entitled “Method of Copying and Restoring Data in Asymmetric Cluster Distribution File System” discloses a method of copying and restoring data in an asymmetric cluster distribution file system, in which a main partition and a subsidiary partition are separated from each other in a data server and a main chunk and a subsidiary chunk are separated and managed, thereby efficiently processing the copying and restoration of chunks.
In this case, when the failure of a data server 20 is detected, the asymmetric cluster file system should maintain a predetermined number of copies of each of the data chunks stored in the failed data server 20. Otherwise, it may become impossible to access the corresponding data when consecutive failures of the data servers 20 occur, and thus it is necessary to keep track of the data stored in the failed data server 20 and copy it to other data servers 20.
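The bookkeeping described above can be illustrated with a minimal Python sketch. The placement map, the server names and the target of two copies are hypothetical values chosen only for illustration; the text does not prescribe any particular data structure.

```python
def chunks_needing_copies(placement, failed_server, target_copies):
    """Return {chunk_id: number_of_missing_copies} after one server fails.

    `placement` maps a chunk id to the set of data servers holding a copy.
    All names are illustrative assumptions, not part of any protocol.
    """
    shortfall = {}
    for chunk_id, servers in placement.items():
        surviving = servers - {failed_server}
        missing = target_copies - len(surviving)
        if missing > 0:
            shortfall[chunk_id] = missing
    return shortfall


# Hypothetical placement of three data chunks, two copies each.
placement = {
    "chunk-1": {"DS_1", "DS_2"},
    "chunk-2": {"DS_2", "DS_3"},
    "chunk-3": {"DS_1", "DS_3"},
}
result = chunks_needing_copies(placement, failed_server="DS_2", target_copies=2)
print(result)
```

In this sketch only the chunks that had a copy on the failed server are reported, which is the set that would have to be copied to other data servers.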
Here, the asymmetric cluster file system divides each file into sections having a predetermined logical size. These sections are referred to as “logical chunks.” That is, a file is a set of sequential logical chunks. Furthermore, the actual data of the file is divided into chunks, and these chunks are distributed and stored among the plurality of data servers 20. These chunks are referred to as “data chunks.”
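As a minimal sketch of this division, the following Python code splits a file of a given size into sequential logical chunks and maps a byte offset to its logical chunk. The function names and the sizes used in the example are assumptions made for illustration only.

```python
def logical_chunks(file_size, chunk_size):
    """Return a list of (index, offset, length) tuples, one per logical chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        chunks.append((index, offset, length))
        offset += length
        index += 1
    return chunks


def chunk_index_for_offset(offset, chunk_size):
    """Return the index of the logical chunk that contains a byte offset."""
    return offset // chunk_size


chunks = logical_chunks(file_size=10, chunk_size=4)
print(chunks)
print(chunk_index_for_offset(9, 4))
```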
The metadata server 10 of the asymmetric cluster file system supports an arbitrary data distribution pattern for each file. That is, the asymmetric cluster file system stores and manages the address of the data server 20 in which a data chunk has been stored, for each logical chunk of the file. Accordingly, the client 40 accesses the data server 20 in which each logical chunk has been stored, and then performs an input or output task.
However, the metadata server 10 of some asymmetric cluster file systems supports only a fixed data distribution pattern for each file. That is, the metadata server 10 does not manage the address of the data server 20 in which a data chunk for each logical chunk of the file has been stored, but stores and manages only a list and the sequence of the addresses of the data servers 20 in which data chunks have been stored, and the index of the data server 20 in which a first data chunk has been stored. Accordingly, the client 40 performs an input or output task after accessing the data servers 20 in the sequence of the list of the data servers 20 in which data chunks have been stored, which starts from the data server 20 in which the first data chunk has been stored.
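The difference between the two schemes can be sketched in Python as follows. The per-chunk map, the server list and the starting index are hypothetical, and the wrap-around to the start of the list is an assumption about how the sequence continues once the end of the list is reached.

```python
def server_for_chunk_arbitrary(chunk_to_server, chunk_index):
    """Arbitrary pattern: one stored address per logical chunk."""
    return chunk_to_server[chunk_index]


def server_for_chunk_fixed(server_list, first_index, chunk_index):
    """Fixed pattern: only the list and the first index are stored.

    The server is derived by walking the list in sequence from
    first_index, wrapping around at the end (assumed behaviour).
    """
    return server_list[(first_index + chunk_index) % len(server_list)]


# Arbitrary pattern: the map can place any chunk on any server.
chunk_to_server = {0: "DS_3", 1: "DS_1", 2: "DS_1", 3: "DS_2"}
arbitrary = [server_for_chunk_arbitrary(chunk_to_server, i) for i in range(4)]

# Fixed pattern: placement follows from the list and the first index.
server_list = ["DS_1", "DS_2", "DS_3"]
fixed = [server_for_chunk_fixed(server_list, 1, i) for i in range(4)]

print(arbitrary)
print(fixed)
```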
Although an asymmetric cluster file system that supports logical chunks into which each file has been divided can more efficiently manage the data servers 20, it is problematic in that the size of the metadata of each file increases when the size of the file is large or the size of chunks is small, so that the amount of content that should be managed by the metadata server 10 becomes large and the amount of content that will be exchanged with the client 40 also becomes large, thereby imposing excessive load on the overall system.
In contrast, although an asymmetric cluster file system that supports only a fixed data distribution pattern can minimize the load of the overall system even when the size of a file is large or the size of chunks is small, it is problematic in that it is difficult to efficiently manage the data servers 20.
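The trade-off can be made concrete with a rough count of the location entries that the metadata server would hold for one file under each scheme. The following Python sketch counts entries rather than bytes, and the file size, chunk size and number of data servers are hypothetical values chosen for illustration.

```python
def per_chunk_entries(file_size, chunk_size):
    """Entries needed when one address is stored per logical chunk."""
    # Integer ceiling division: a partial final chunk still needs an entry.
    return -(-file_size // chunk_size)


def fixed_pattern_entries(server_count):
    """Entries needed for a fixed pattern: the server list plus a first index."""
    return server_count + 1


file_size = 2**40          # hypothetical 1 TiB file
chunk_size = 64 * 2**10    # hypothetical 64 KiB chunks

arbitrary_count = per_chunk_entries(file_size, chunk_size)
fixed_count = fixed_pattern_entries(server_count=8)
print(arbitrary_count, fixed_count)
```

Under these assumed values the per-chunk scheme needs 16,777,216 entries for the file, whereas the fixed pattern needs 9 regardless of the file size.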
The most representative example of an asymmetric cluster file system that supports only a fixed data distribution pattern for each file is a file system based on the version 4.1 protocol standard of the Network File System (NFS), which is the most popular and widely used. The NFS version 4.1 protocol standard was officially established as Request for Comments (RFC) 5661 in January of 2010 by the Internet Engineering Task Force (IETF), which establishes and manages numerous Internet-related standards worldwide.
For the configuration of a protocol-based asymmetric cluster file system, the NFS version 4.1 protocol standard includes the Parallel NFS (pNFS) protocol, which is used between the client 40 and the metadata server 10, and the representative file layout-type storage protocol, which is used between the client 40 and the data servers 20.
The file layout-type protocol is advantageous in that the construction, control and management of the data servers 20 are easy because it uses NFS-based servers as the data servers 20 and can thus provide a file-based environment that is most familiar to common users. However, the NFS version 4.1 protocol standard stipulates that a control protocol used between the metadata server 10 and the data servers 20 and between the data servers 20 is outside the scope of the standard, and does not set it forth. Accordingly, when an asymmetric cluster file system based on the NFS version 4.1 protocol is constructed, a control protocol for the asymmetric cluster file system must be provided separately.
A method by which a client performs an operation on a file in an asymmetric cluster file system that supports a layout-type file system protocol will be described with reference to FIGS. 2 and 3.
It is assumed that a client 40 performs a write or read operation on file A 50 that is logically composed of D_1 51 to D_n+2 56.
The client 40 sends an OPEN request (that is, OPEN(A)) to a metadata server 10 in order to perform a read or write operation on file A 50 at step S10. The metadata server 10 prepares metadata for the corresponding file and sends a response, including the file handle value (filehandle=a_0) of the corresponding file, to the client 40 at step S11.
The client 40 sends a LAYOUT_GET request, including the file handle value a_0 received from the metadata server 10, (that is, LAYOUT_GET a_0) to the metadata server 10 in order to find the locations of data chunks for the logical chunks of the file at step S12. The metadata server 10 sends a response, including the ID value of a device (dev_id=1) in which a file having the corresponding file handle value a_0 has been stored and also including file layout information, that is, a list of the file handle values that are managed by the data servers 20a, 20b, 20c and 20d storing the data chunks (that is, filehandle={a_1, a_2, . . . , a_n}), to the client 40 at step S13. Meanwhile, if the client 40 is already aware of the file layout information, this step is not performed.
The client 40 sends a DEVICE_INFO request, including the device ID value (that is, 1) received from the metadata server 10, (that is, DEVICE_INFO(1)) to the metadata server 10 in order to find detailed information about the device ID value received from the metadata server 10 at step S14. The metadata server 10 sends a response, including device information having the corresponding device ID value (that is, 1), to the client 40 at step S15. Meanwhile, if the client 40 is already aware of the detailed information about the device ID value, this step is not performed. Here, the device information includes a list of the addresses of data servers in which data chunks for respective logical chunks have been stored (that is, multipath list=[{DS_1}, {DS_2}, . . . , {DS_n}]), the stripe sequence of a list of data servers in which logical chunks have been stored (that is, stripeindices={0, 1, . . . , n−1}), and the index value of a data server in which a first logical chunk has been stored (that is, first_stripe_index=0).
The client 40 derives the address of each data server and a file handle value in the data server by combining the response including the file layout information received from the metadata server 10 (that is, the response at step S13) with the response including the device information (that is, the response at step S15). This enables the client 40 to send a write or read request, including the corresponding file handle value, the offset and size of a corresponding logical chunk and actual data content, to each data server in order to write or read actual data at steps S16, S18, S20, S22, S24 and S26.
In this case, the file handle values to be sent to the data servers 20a, 20b, 20c and 20d are the values of the file handle value list (filehandle list={a_1, a_2, . . . , a_n}) included in the file layout information that correspond to the indices of the stripe sequence of the list of the data servers in which the logical chunks have been stored (stripeindices={0, 1, . . . , n−1}), and reference starts from the index value of the data server in which the first logical chunk has been stored (first_stripe_index=0). Furthermore, each data server performs a corresponding operation and sends a response, including the results of the performance, to the client 40 at steps S17, S19, S21, S23, S25 and S27.
Referring to FIG. 2, since the value of a data server index at which a first logical chunk has been stored (first_stripe_index) is 0, the client 40 selects {DS_1}, which is the first value of the list of the addresses of data servers in which data chunks for respective logical chunks have been stored (multipath list=[{DS_1}, {DS_2}, . . . , {DS_n}]). Then the client 40 accesses data server 1 20a, and performs a write or read operation. Furthermore, the client 40 sequentially accesses data servers stored at the corresponding indices of a list of the addresses of the data servers (multipath list=[{DS_1}, {DS_2}, . . . , {DS_n}]) in which the data chunks for the logical chunks have been stored in the stripe sequence of the list of the data servers in which the logical chunks have been stored (stripeindices={0, 1, . . . , n−1}), and sends a write or read operation request to the corresponding data server at steps S16, S18, S20, S22 and S24.
The values of the corresponding indices of the file handle value list (filehandle list={a_1, a_2, . . . , a_n}) in the file layout information are used as the file handle values to be sent to the data servers 20. Furthermore, the write or read operation of the client 40 on a file is repeated based on the stripe sequence of a list of the data servers in which logical chunks have been stored (stripeindices={0, 1, . . . , n−1}) depending on the size of the file until the entire operation is completed. That is, if there remains a write or read task to be performed after the client 40 has sent a write or read operation request to a data server at the last position of the stripe sequence of the list of the data servers in which the logical chunks have been stored at step S22 and then has received a response thereto at step S23, the client 40 sequentially accesses the data servers in the list of the data servers in which the logical chunks have been stored in the stripe sequence starting from the first data server, sends a write or read operation request at steps S24 and S26, and receives responses, including the results of the operation, at steps S25 and S27.
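The derivation described above can be sketched in Python as follows. This is a minimal illustration that follows the description in the text, not the full rules of the NFS version 4.1 standard; the class names, field names and the three-server example are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DeviceInfo:
    multipath_list: list      # one list of addresses per data server slot
    stripe_indices: list      # sequence of slots to visit
    first_stripe_index: int   # position at which the first logical chunk starts


@dataclass
class FileLayout:
    dev_id: int
    filehandles: list         # one file handle value per data server slot


def resolve(layout, device, chunk_index):
    """Return (addresses, filehandle) for a logical chunk.

    The stripe sequence is walked from first_stripe_index and wraps
    around after its last position, as in steps S24 and S26.
    """
    position = (device.first_stripe_index + chunk_index) % len(device.stripe_indices)
    slot = device.stripe_indices[position]
    return device.multipath_list[slot], layout.filehandles[slot]


device = DeviceInfo(
    multipath_list=[["DS_1"], ["DS_2"], ["DS_3"]],
    stripe_indices=[0, 1, 2],
    first_stripe_index=0,
)
layout = FileLayout(dev_id=1, filehandles=["a_1", "a_2", "a_3"])

targets = [resolve(layout, device, i) for i in range(5)]
for addresses, filehandle in targets:
    print(addresses, filehandle)
```

With three data servers and five logical chunks, the fourth and fifth chunks return to the first and second data servers, which corresponds to the repetition of the stripe sequence described above.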
Once the client 40 has completed the write operation, it sends a LAYOUT_COMMIT request, including information about the completion and the file handle value a_0 of the corresponding file, (that is, LAYOUT_COMMIT a_0) to the metadata server 10 at step S28. The metadata server 10 updates the metadata information of the corresponding file while referring to the information about the completion received from the client 40 and then sends a response, including the results of the updating, to the client 40 at step S29.
If the client 40 does not need to access file A 50 any longer, it sends a CLOSE request, including the file handle value a_0 of the corresponding file, (that is, CLOSE(a_0)) to the metadata server 10 at step S30. The metadata server 10 performs the task of updating the metadata information of the corresponding file and the task of returning system resources and then sends a response, including the results of the performance, to the client 40 at step S31.
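The overall order of requests from steps S10 to S31 can be summarized with a small Python sketch. The function and its parameters are hypothetical; it only lists the requests a client would issue, skipping LAYOUT_GET and DEVICE_INFO when the client is already aware of the corresponding information, and issuing LAYOUT_COMMIT only after a write.

```python
def plan_requests(chunk_count, is_write, layout_cached=False, device_cached=False):
    """Return the ordered list of requests for one file operation."""
    requests = ["OPEN"]
    if not layout_cached:
        requests.append("LAYOUT_GET")      # sent to the metadata server
    if not device_cached:
        requests.append("DEVICE_INFO")     # sent to the metadata server
    operation = "WRITE" if is_write else "READ"
    # One request per logical chunk, sent to the data servers.
    requests.extend(f"{operation}[{i}]" for i in range(chunk_count))
    if is_write:
        requests.append("LAYOUT_COMMIT")   # reports completion of the write
    requests.append("CLOSE")
    return requests


write_plan = plan_requests(chunk_count=2, is_write=True)
read_plan = plan_requests(chunk_count=1, is_write=False,
                          layout_cached=True, device_cached=True)
print(write_plan)
print(read_plan)
```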
As described above, the asymmetric cluster file system that supports the layout-type file system protocol presents data server multipathing in the storage protocol used between the client 40 and the metadata server 10 to be used in case of the failures of the data servers 20, but does not present a specific solution to the failures of the data servers 20.
The data server multipathing will now be described in detail. When the metadata server 10 responds to a DEVICE_INFO request from the client 40, it may place multiple addresses in an entry of the list of the addresses of the data servers 20 in which data chunks for respective logical chunks have been stored (multipath list=[{DS_1}, {DS_2}, . . . , {DS_n}]), which is selected from corresponding device information, and transfer the list. For example, in the above-described example, the client 40 can access only data server 1 20a because the address of the first data server 20 is described as being only {DS_1}. When the address of the first data server 20 is multiplexed and described as being {DS_1, DS_2, DS_n}, the client 40 can access any of data server 1 20a, data server 2 20b and data server n 20d, in each of which a data chunk for the first logical chunk is present, and perform a write or read operation using the first file handle value a_1 of the file handle value list.
However, because the same file handle value must be used when any of the multipathed data servers 20 is accessed, the overhead of synchronizing file handle values may be incurred, and additional high load may occur when the task of copying data to prepare for the failures of the data servers 20 is performed.
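The effect of data server multipathing on the client can be sketched in Python as follows. The policy of trying the addresses of an entry in order and the reachability check are assumptions made for illustration; the text only states that the client can access any of the listed data servers using the same file handle value.

```python
def pick_server(addresses, is_reachable):
    """Return the first reachable address of a multipath entry, or None."""
    for address in addresses:
        if is_reachable(address):
            return address
    return None


failed = {"DS_1"}  # hypothetical set of failed data servers


def is_reachable(address):
    return address not in failed


# Single-address entry: no alternative when DS_1 has failed.
single = pick_server(["DS_1"], is_reachable)

# Multipath entry for the first logical chunk; the same file handle
# value a_1 is used whichever data server is chosen.
multi = pick_server(["DS_1", "DS_2", "DS_n"], is_reachable)

print(single, multi)
```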
Since the asymmetric cluster file system that supports a layout-type file system protocol does not present a control protocol used between the metadata server 10 and the data servers 20 and between the data servers 20, an effective control protocol for these paths is required in order for the metadata server 10 to control the data servers 20.
Furthermore, the asymmetric cluster file system that supports a layout-type file system protocol does not present a specific solution that prepares for the failures of the data servers 20. As a result, there is a need for an effective method capable of overcoming this problem while complying with the system protocol. Because of the above-described difference between the data distribution pattern schemes, a method different from a data copying method that is used in a general asymmetric cluster file system is required.