With the development of globalization and information technologies, enterprise data is growing explosively, and the redundancy of the enterprise data is increasing as well. The technology for deleting duplicate data is a data reduction technology that reduces the storage space occupied by redundant data in a storage system.
In a method for deleting duplicate data in the prior art, a user file is generally divided into multiple data blocks; among duplicate data blocks, only one copy is retained and recorded in a data block file; and an index relationship between the user file and the data block file is established, so that the duplicate data is deleted. Before the user file is modified, a modified file corresponding to the user file needs to be set up; the modified data blocks are recorded in the modified file, and an index between the modified user file and the modified file is set up. That is, the index of the modified user file includes two types of entries: entries pointing to the data block file and entries pointing to the modified file. Because each modified file corresponds to one user file, a large number of modified files may be generated when a large number of user files are modified. When the number of modified files reaches a certain level, the deletion rate of duplicate data is greatly reduced, because the data blocks recorded in the modified files are not checked for duplicates, and the performance in modifying other user files is also affected.
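The prior-art scheme described above can be illustrated with a minimal sketch. It is not taken from any particular implementation: the fixed block size, the use of SHA-256 digests to detect duplicate blocks, and the names `DedupStore`, `write`, `modify_block`, and `read` are all assumptions introduced for illustration only.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, deliberately tiny block size for illustration


class DedupStore:
    """Illustrative model of the prior-art scheme (all names are hypothetical)."""

    def __init__(self):
        self.block_file = []      # data block file: one copy of each unique block
        self.block_pos = {}       # digest -> position of the block in block_file
        self.index = {}           # user file name -> list of (source, position)
        self.modified_files = {}  # user file name -> its own modified file

    def write(self, name, data):
        """Divide a user file into blocks and keep one copy of each duplicate."""
        entries = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.block_pos:
                self.block_pos[digest] = len(self.block_file)
                self.block_file.append(block)
            # Index entry pointing to the data block file.
            entries.append(("block", self.block_pos[digest]))
        self.index[name] = entries

    def modify_block(self, name, block_no, new_block):
        """Record a modified block in the modified file of this user file."""
        modified = self.modified_files.setdefault(name, [])
        modified.append(new_block)  # stored as-is, with no duplicate check
        # Index entry pointing to the modified file.
        self.index[name][block_no] = ("modified", len(modified) - 1)

    def read(self, name):
        """Rebuild a user file by following both types of index entries."""
        parts = []
        for source, pos in self.index[name]:
            if source == "block":
                parts.append(self.block_file[pos])
            else:
                parts.append(self.modified_files[name][pos])
        return b"".join(parts)
```

In this sketch, every modified user file acquires its own entry in `modified_files`, and a block written there is stored even if an identical block already exists in `block_file`. This mirrors the drawback noted above: as more user files are modified, more modified files accumulate and the share of data that is actually deduplicated falls.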