The present invention relates generally to storage systems and, more particularly, to data reduction in storage systems.
US2011/0231613 describes a remote storage caching technology that uses an SSD (Solid State Drive) installed on a server as a cache for data stored in the storage system connected to the server. U.S. Pat. No. 7,870,105 describes a de-duplication technology that reduces the amount of data in the storage system. With this data reduction technology, the storage system provides a virtual storage area (Virtual Volume, VVOL for short) to the server. The storage system manages the address of the physical storage area corresponding to each partial area of the volume. If the data of two or more partial areas of the volume are the same, the addresses of those partial areas point to one physical area. Thus, the amount of physical capacity consumed is reduced. This technology is called de-duplication. Two or more partial areas of different volumes can also share the same physical area.
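The address management described above can be illustrated with a minimal sketch. It is not taken from the cited references: the class name DedupStore, its fields, and the use of a SHA-256 digest to detect identical data are assumptions made only for illustration, and reference counting and reclamation of overwritten areas are omitted.

```python
import hashlib


class DedupStore:
    """Hypothetical model of a storage system with de-duplication."""

    def __init__(self):
        self.mapping = {}    # (volume id, partial area number) -> physical address
        self.physical = {}   # physical address -> stored data
        self.by_digest = {}  # content digest -> physical address (assumed detection method)
        self.next_addr = 0

    def write(self, volume_id, area_no, data):
        """Store data for one partial area and return its physical address."""
        digest = hashlib.sha256(data).hexdigest()
        addr = self.by_digest.get(digest)
        if addr is None:
            # First occurrence of this content: allocate a new physical area.
            addr = self.next_addr
            self.next_addr += 1
            self.physical[addr] = data
            self.by_digest[digest] = addr
        # Identical content reuses the existing physical area.
        self.mapping[(volume_id, area_no)] = addr
        return addr

    def read(self, volume_id, area_no):
        """Return the data of one partial area via the address mapping."""
        return self.physical[self.mapping[(volume_id, area_no)]]
```

In this sketch, writing the same data to a partial area of one volume and to a partial area of another volume yields the same physical address, so only one physical area is consumed.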
In general, a storage system is shared by multiple servers. By performing de-duplication in the storage system, data stored by different servers are also de-duplicated against one another. Consequently, capacity efficiency is improved. In addition to the de-duplication functionality, the snapshot functionality also allows multiple partial areas to share one physical area.
When the storage system has data reduction functionalities, such as de-duplication or snapshot, the server cannot perceive the data sharing status in the storage system. As such, if the server reads two different partial areas which point to the same physical area, the same data is transferred to the server twice. Moreover, from the viewpoint of the server, since the transferred data belong to different addresses, two identical copies of the data are stored in the flash memory of the server. As a result, the utilization efficiency of the network and of the flash memory in the server is reduced.
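The problem described above can be reproduced with a small simulation. The sketch below is illustrative only: the classes Storage and ServerCache, the address tuples, and the 4096-byte area size are hypothetical, and the server cache is assumed to be indexed purely by volume address, with no knowledge of the sharing status in the storage system.

```python
class Storage:
    """Hypothetical storage system in which two addresses share one physical area."""

    def __init__(self, mapping, physical):
        self.mapping = mapping    # volume address -> physical address
        self.physical = physical  # physical address -> data
        self.bytes_sent = 0       # total bytes transferred to the server

    def read(self, address):
        data = self.physical[self.mapping[address]]
        self.bytes_sent += len(data)
        return data


class ServerCache:
    """Hypothetical server-side flash cache indexed by volume address only."""

    def __init__(self, storage):
        self.storage = storage
        self.entries = {}  # volume address -> cached data

    def read(self, address):
        if address not in self.entries:
            # Cache miss: the server cannot tell that another address
            # already holds the same data, so it fetches it again.
            self.entries[address] = self.storage.read(address)
        return self.entries[address]


storage = Storage(
    mapping={("vvol0", 0): 7, ("vvol0", 9): 7},  # both point to physical area 7
    physical={7: b"A" * 4096},
)
cache = ServerCache(storage)
cache.read(("vvol0", 0))
cache.read(("vvol0", 9))

transferred = storage.bytes_sent                               # 8192
cached_bytes = sum(len(d) for d in cache.entries.values())     # 8192
```

Although only 4096 bytes are stored physically, 8192 bytes cross the network and 8192 bytes of server flash memory are occupied.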
When the server, rather than the storage system, has data reduction functionalities, identical data stored by two or more servers are not reduced. For example, OS data will not be reduced across servers, although most of the data are the same. A configuration in which both the storage system and the server have data reduction functionalities could be considered. However, the data reduction processing overhead would be doubled. Moreover, the amount of data transferred between the storage system and the server would still not be reduced.