In general, an asymmetric clustering file system includes a metadata server (MDS) managing metadata of a file, a plurality of data servers (DS) managing data of a file, and a plurality of client systems storing or searching a file. The metadata server, the plurality of data servers, and the plurality of client systems are connected to and interwork with one another through communication over a local network.
The plurality of data servers provide a single mass storage space using virtualization technology, and the storage space may be freely managed by adding or removing data servers or volumes in the data servers. Because the failure rate rises in proportion to the number of servers, a system that manages a plurality of data servers mainly uses either a method that distributes data while providing parity for recovery, such as Redundant Array of Inexpensive Disks (RAID) level 5, or a mirroring technology that provides a copy of the data. The mirroring technology is inefficient in terms of storage because the data is stored in duplicate. For this reason, a data distributive storage structure using parity is preferred when fault tolerance is required.
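The storage cost of the two approaches can be compared with simple arithmetic. The following is a minimal sketch; the function names and the example configurations (two mirrored copies, four data units plus one parity unit) are assumptions chosen for illustration and do not come from the related art.

```python
def usable_fraction_mirroring(copies):
    """Fraction of raw capacity holding unique data when every block is kept `copies` times."""
    return 1 / copies

def usable_fraction_parity(data_units, parity_units):
    """Fraction of raw capacity holding unique data in a stripe of data units plus parity units."""
    return data_units / (data_units + parity_units)

# Hypothetical configurations, for illustration only.
mirror = usable_fraction_mirroring(2)   # two full copies of the data
parity = usable_fraction_parity(4, 1)   # four data units and one parity unit per stripe
print(f"mirroring: {mirror:.0%} usable, parity: {parity:.0%} usable")
```

Under these assumed configurations, mirroring leaves half of the raw capacity for unique data, while the parity layout leaves four fifths.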
When an error occurs in a storage device that stores data, the data distributive storage structure using parity may recover the data stored in the failed storage device by using parity. A set of data over which parity is computed is called a stripe, and the number of parities generated for each stripe determines the number of data servers that may fail simultaneously while still allowing recovery without data loss. For example, in the case where two parities are stored for each stripe, even if failure occurs simultaneously in two data servers storing data constituting a stripe, the data stored in the two failed data servers can be recovered by using the two parities and the data held by the data servers other than the two failed servers.
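Recovery by parity can be illustrated for the simplest case of one parity per stripe. The sketch below assumes byte-wise XOR parity, as is commonly used for RAID level 5; the function names and sample blocks are hypothetical. XOR alone tolerates only a single failure per stripe, so the two-parity example described above would require a stronger erasure code (for example, a Reed-Solomon style code), which is not shown here.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_parity(data_blocks):
    """Compute the single parity block for one stripe."""
    return xor_blocks(data_blocks)

def recover_block(surviving_blocks, parity):
    """Rebuild the one missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

# One stripe spread over three hypothetical data servers.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)

# Simulate the failure of the server holding block 1.
lost_index = 1
survivors = [block for i, block in enumerate(stripe) if i != lost_index]
recovered = recover_block(survivors, parity)
assert recovered == stripe[lost_index]
```

Because XOR is its own inverse, combining the parity with every surviving block cancels their contributions and leaves exactly the missing block.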
According to the related art, data storage and parity calculation are symmetrically processed in the data distributive storage structure using parity. In other words, parity calculation is performed in units of stripes at the time when file data is stored by a client system, and the parity and data of the corresponding stripe are stored simultaneously across a plurality of data servers in a distributive manner. However, this concentrates the overhead of parity calculation in the client, thus degrading efficiency.
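The client-side write path described above may be sketched as follows. This is an illustration only: the function names, the XOR parity, the zero padding of the final stripe, and the fixed position of the parity unit are all assumptions, and a real system such as RAID level 5 would typically rotate the parity position across servers.

```python
def xor_parity(units):
    """Byte-wise XOR parity over equally sized stripe units."""
    out = bytearray(len(units[0]))
    for unit in units:
        for i, byte in enumerate(unit):
            out[i] ^= byte
    return bytes(out)

def client_write(data, num_data_servers, unit_size):
    """Split file data into stripes and compute parity for each stripe on the client.

    Returns one list per stripe; entry i is the unit destined for data
    server i, and the last entry is the parity unit (hypothetical layout).
    """
    stripe_bytes = num_data_servers * unit_size
    placements = []
    for offset in range(0, len(data), stripe_bytes):
        # Pad the final stripe with zero bytes so every unit has the same size.
        chunk = data[offset:offset + stripe_bytes].ljust(stripe_bytes, b"\0")
        units = [chunk[i * unit_size:(i + 1) * unit_size]
                 for i in range(num_data_servers)]
        parity = xor_parity(units)  # the parity work happens here, on the client
        placements.append(units + [parity])
    return placements

placements = client_write(b"abcdefgh", num_data_servers=2, unit_size=2)
for stripe_no, units in enumerate(placements):
    print(stripe_no, units)
```

In this sketch every stripe passes through `xor_parity` before anything is sent to a data server, which is the concentration of parity overhead in the client that the paragraph above identifies.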