The present disclosure relates to a method and system for data management, and more specifically, to a data de-duplication method and system capable of de-duplicating data files in a memory that originate from the same source data.
With the growing volume of stored data and the emergence of cloud storage, there is an increasing need to properly manage the data in a storage space so as to meet the rising demand for data storage. Users often store many data files that are the same or substantially the same as one another, and thus data de-duplication techniques have been proposed to eliminate the redundant data, so that storage utilization is improved and the cost of network data transfer is reduced.
For example, existing techniques for data de-duplication usually eliminate redundant data having the same bits by comparing the content of two data files bit by bit. When dealing with two data files that have different binary bit strings, even if the two data files originate from the same source data, existing data de-duplication techniques may not perform well.
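The limitation described above may be illustrated by the following minimal sketch, which is not taken from the disclosure. It assumes, purely for illustration, that "the same source data" is a short text string saved under two different character encodings; the function name `dedupe_exact` and the file names are hypothetical. The sketch compares files byte by byte, which is equivalent to a bit-by-bit comparison.

```python
from typing import Dict


def dedupe_exact(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file name to the first file holding identical bytes.

    Hypothetical helper: a file is treated as redundant only when its
    content is byte-for-byte (hence bit-for-bit) equal to a kept file.
    """
    kept: Dict[str, bytes] = {}       # unique files retained so far
    canonical: Dict[str, str] = {}    # file name -> name of the retained copy
    for name, data in files.items():
        for kept_name, kept_data in kept.items():
            if data == kept_data:     # exact comparison of the binary content
                canonical[name] = kept_name
                break
        else:
            # No identical file was found, so this file is retained.
            kept[name] = data
            canonical[name] = name
    return canonical


# Assumed example: one source text stored three times.
source_text = "quarterly report\n"
files = {
    "a.txt": source_text.encode("utf-8"),
    "b.txt": source_text.encode("utf-8"),   # bit-identical copy of a.txt
    "c.txt": source_text.encode("utf-16"),  # same source, different bit string
}

result = dedupe_exact(files)
# b.txt is recognized as a duplicate of a.txt, whereas c.txt is retained
# as a separate file although it originates from the same source data.
```

In this sketch, only the bit-identical copy is eliminated; the file carrying the same source data under a different binary representation is not recognized as redundant.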