The rapid growth in the amount of data held in storage systems such as file servers has increased the number and capacity of disks attached to each file server, and with them the cost of introducing and maintaining those disks. Deduplication technologies, which reduce the amount of data stored on a file server, are attracting attention as a way to reduce this cost. The main deduplication technologies are classified into block-level deduplication, in which duplicates are eliminated on a block-by-block basis, and file-level deduplication, in which duplicates are eliminated on a file-by-file basis.
File-level deduplication imposes a lighter load than block-level deduplication and is therefore often applied to primary file servers, which are required to deliver high performance. Common methods of carrying out file-level deduplication are described in Patent Literature 1 and Non Patent Literature 1. The technology described in Non Patent Literature 1, for example, copies a file that matches a policy for selecting deduplication targets to a hidden area of the file system. The file is then converted into a stub file: a reference to the copied original file is left in place, and the data block that the file has been using is freed. From then on, when another file that matches the policy is determined to be identical to an original file already copied to the hidden area, that file is likewise converted into a stub file, which eliminates the duplication.
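The stub-file flow described above can be illustrated with a minimal in-memory sketch. It is not the implementation of Non Patent Literature 1. The names `File`, `FileSystem`, `deduplicate`, and `size_policy` are hypothetical, and the use of a SHA-256 digest to decide that two files are identical is an assumption made for brevity; an actual file server may compare contents differently.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class File:
    owner: str
    data: Optional[bytes]              # None once the file has become a stub
    logical_size: int                  # size the user sees, unchanged by deduplication
    original_id: Optional[str] = None  # reference to the original in the hidden area

    @property
    def is_stub(self) -> bool:
        return self.original_id is not None


class FileSystem:
    """Toy model: visible files plus a hidden area holding original copies."""

    def __init__(self) -> None:
        self.files: Dict[str, File] = {}
        # Hidden area, keyed by content digest (assumption: digest equality
        # is treated as file identity in this sketch).
        self.hidden: Dict[str, bytes] = {}

    def write(self, path: str, owner: str, data: bytes) -> None:
        self.files[path] = File(owner=owner, data=data, logical_size=len(data))

    def deduplicate(self, policy: Callable[[str, File], bool]) -> None:
        for path, f in self.files.items():
            if f.is_stub or not policy(path, f):
                continue
            digest = hashlib.sha256(f.data).hexdigest()
            if digest not in self.hidden:
                # First file with this content: copy it to the hidden area.
                self.hidden[digest] = f.data
            # Leave a reference to the original and free the file's own data.
            f.original_id = digest
            f.data = None

    def read(self, path: str) -> bytes:
        f = self.files[path]
        return self.hidden[f.original_id] if f.is_stub else f.data


def size_policy(path: str, f: File) -> bool:
    # Hypothetical policy: only files of at least 4 bytes are targets.
    return f.logical_size >= 4


fs = FileSystem()
fs.write("/home/a/report", "alice", b"same content")
fs.write("/home/b/report", "bob", b"same content")
fs.write("/home/b/tmp", "bob", b"x")
fs.deduplicate(size_policy)
```

In this sketch both `report` files become stubs that refer to a single copy in the hidden area, while `/home/b/tmp` does not match the policy and keeps its own data. Reads through `fs.read` return the original content in either case.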
To manage the cost of data storage appropriately, file servers generally provide a quota function that limits the amount of data each user of the file server is allowed to store. As ways to provide the quota function on a file server that has a deduplication function, Non Patent Literature 1 describes two methods: one based on logical capacity consumption and one based on physical capacity consumption.
In the method based on logical capacity consumption, the capacity consumption of a user is the total logical size of the files owned by that user. This method therefore also counts the logical size of a file that has been processed by deduplication, even though the file no longer consumes a physical data block. The method based on physical capacity consumption, on the other hand, counts the size of the physical data blocks that are actually consumed as the capacity consumption of a user.
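The difference between the two accounting methods can be made concrete with a short sketch. The names `FileEntry`, `logical_usage`, and `physical_usage` are hypothetical, and the sizes are illustrative values only. The sketch also assumes that a stub file consumes zero physical bytes and that the original copy in the hidden area is not charged to any user; the text above does not specify either point.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FileEntry:
    owner: str
    logical_size: int   # size presented to the user
    physical_size: int  # bytes of data blocks actually held by this file
                        # (assumed to be 0 for a deduplicated stub file)


def logical_usage(files: Iterable[FileEntry], owner: str) -> int:
    """Capacity consumption under the logical method."""
    return sum(f.logical_size for f in files if f.owner == owner)


def physical_usage(files: Iterable[FileEntry], owner: str) -> int:
    """Capacity consumption under the physical method."""
    return sum(f.physical_size for f in files if f.owner == owner)


files = [
    FileEntry("alice", logical_size=100, physical_size=100),  # ordinary file
    FileEntry("alice", logical_size=400, physical_size=0),    # stub file
    FileEntry("bob", logical_size=400, physical_size=0),      # stub file
]

alice_logical = logical_usage(files, "alice")
alice_physical = physical_usage(files, "alice")
bob_logical = logical_usage(files, "bob")
bob_physical = physical_usage(files, "bob")
```

Under these assumptions, the logical method charges alice 500 bytes although only 100 bytes of her files still occupy data blocks, whereas the physical method charges her 100 bytes and charges bob nothing.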