Field of the Invention
The present invention relates in general to computers, and more particularly to lookup-based data block alignment for data deduplication in a computing environment.
Description of the Related Art
In today's society, computer systems are commonplace. Computer systems may be found in the workplace, at home, or at school. A data processing system typically includes a processor subsystem having at least one central processing unit (CPU), an input/output (I/O) subsystem, a memory subsystem and a bus subsystem. The memory subsystem of the data processing system typically includes a data storage system having a controller connected to back end storage. The controller controls the flow of data between the data processing system and the back end storage. The controller includes a cache memory that is typically implemented by static memories. During operation, the cache memory serves as a temporary store for data associated with a write I/O request.
These data processing systems may include data storage systems, or disk storage systems, to process and store data. Large amounts of data have to be processed daily and the current trend suggests that these amounts will continue being ever-increasing in the foreseeable future. For the most part, computing systems face a significant challenge to meet the increasingly stringent demands for storing large amounts of data. An efficient way to alleviate the problem is by using deduplication. The idea underlying a deduplication system is to exploit the fact that large parts of the available data is copied again and again and forwarded without any change, by locating repeated data and storing only its first occurrence. Accordingly, it would be desirable to improve and optimize data deduplication.