Storage systems have been equipped with a variety of storage functions in recent years. Furthermore, since storage vendors have been marketing these storage functions for profit, enhancing storage function performance increases customer value. Compared to a magnetic disk, a flash memory not only has better basic performance but also features different operating characteristics, and as such, using flash memory is an effective approach to enhancing the performance of storage functions.
Furthermore, when rewriting data, the memory characteristics of a flash memory make it impossible to directly overwrite this data in the physical area in which this data was originally stored. When carrying out a data write to an area in which a write has already been carried out, it is necessary to first execute a delete process in a unit called a “block”, which is the flash memory delete unit, and to write the data thereafter. For this reason, in a case where data is to be rewritten, most often the data is not written to the area in which this data was originally stored, but rather is written to a different area. When data for the same address has been written to a plurality of areas in this way and a block becomes full, a block delete process is performed, and a process is carried out to write only the latest data and to create a free area. This process will be called a “reclamation process” hereinbelow. For this reason, in a package equipped with a flash memory (hereinafter, a flash memory package), a logical address layer, which is different from the physical address layer, is provided as the address layer that is visible from outside the flash memory package, and the physical address that is allocated to a logical address is changed as needed. Furthermore, since the logical address does not change when the physical address changes, a data access using the same address is possible from outside the flash memory package, thereby enabling good usability to be maintained.
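The out-of-place write and the reclamation process described above can be illustrated with a minimal sketch. This is not the method of any cited literature; the class name FlashPackage, its parameters, and the policy of reclaiming the block with the most stale entries are all assumptions made for illustration only.

```python
class FlashPackage:
    """Toy model (hypothetical): out-of-place writes with a simple reclamation process."""

    def __init__(self, num_blocks=2, slots_per_block=4):
        # A slot holds (logical_address, data), or None when it is erased.
        self.blocks = [[None] * slots_per_block for _ in range(num_blocks)]
        self.mapping = {}      # logical address -> (block index, slot index)
        self.erase_count = 0   # number of block delete processes performed

    def _find_free_slot(self):
        for b, block in enumerate(self.blocks):
            for s, slot in enumerate(block):
                if slot is None:
                    return b, s
        return None

    def _stale_count(self, b):
        # A slot is stale when the mapping no longer points at it.
        return sum(1 for s, slot in enumerate(self.blocks[b])
                   if slot is not None and self.mapping.get(slot[0]) != (b, s))

    def _reclaim(self):
        # Assumed policy: reclaim the block holding the most stale data.
        victim = max(range(len(self.blocks)), key=self._stale_count)
        if self._stale_count(victim) == 0:
            raise RuntimeError("no stale data to reclaim: package is full")
        live = [slot for s, slot in enumerate(self.blocks[victim])
                if self.mapping.get(slot[0]) == (victim, s)]
        # Block delete process, then write back only the latest data.
        self.blocks[victim] = [None] * len(self.blocks[victim])
        self.erase_count += 1
        for s, (laddr, data) in enumerate(live):
            self.blocks[victim][s] = (laddr, data)
            self.mapping[laddr] = (victim, s)

    def write(self, laddr, data):
        loc = self._find_free_slot()
        if loc is None:
            self._reclaim()
            loc = self._find_free_slot()
        b, s = loc
        self.blocks[b][s] = (laddr, data)  # never overwrites a used slot
        self.mapping[laddr] = (b, s)       # the logical address stays the same

    def read(self, laddr):
        b, s = self.mapping[laddr]
        return self.blocks[b][s][1]
```

In this sketch, a caller always uses the same logical address, while the (block, slot) location behind it moves on every rewrite. Real flash memory packages additionally handle wear leveling and sequential programming within a block, which are omitted here.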
Next, technology for reducing the capacity of stored data will be described. Generally speaking, compression technology is a typical technology for reducing the capacity of stored data. In recent years, a technology called capacity virtualization technology has also come into widespread use for this purpose. Capacity virtualization technology shows a host a virtual capacity that is larger than the physical capacity of the storage devices included in the storage system, and is realized by a storage controller inside the storage system. This technology makes use of the characteristic that, when a user is actually using the storage, the amount of data actually stored in a user-defined user volume (a storage device as seen from the standpoint of the user) seldom reaches the defined capacity of that volume. That is, whereas in a case where capacity virtualization technology is not used, the defined physical capacity is allocated when a volume is defined, in a case where capacity virtualization technology is applied, capacity is allocated only when data is actually stored in the storage system. This enables the capacity of stored data to be reduced, and, in addition, also makes it possible to enhance usability since the user does not have to define the volume capacity exactly, but rather need only define a value with a large enough margin. Patent Literature 1 discloses a system in which, in a storage system comprising a storage controller coupled to a large number of flash packages, both the storage controller and the flash packages possess capacity virtualization technology. In Patent Literature 1, to distinguish between the two, the former is called the higher-level capacity virtualization function and the latter is called the lower-level capacity virtualization function. For this reason, the flash package appears to the storage controller to have a larger capacity than the physical capacity of the actual flash memory.
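The allocate-on-write behavior of capacity virtualization can be sketched as follows. The names PagePool and ThinVolume, the page size, and the choice to return all 0's for an unallocated page are illustrative assumptions, not details taken from Patent Literature 1.

```python
PAGE_SIZE = 4  # hypothetical page size in bytes, kept tiny for illustration


class PagePool:
    """Physical pages shared by all volumes (hypothetical helper)."""

    def __init__(self, physical_pages):
        self.free_pages = list(range(physical_pages))
        self.data = {}  # physical page number -> bytes

    def allocate(self):
        if not self.free_pages:
            raise RuntimeError("physical capacity exhausted")
        return self.free_pages.pop(0)


class ThinVolume:
    """Volume whose virtual capacity may exceed the pool's physical capacity."""

    def __init__(self, virtual_pages, pool):
        self.virtual_pages = virtual_pages
        self.pool = pool
        self.page_table = {}  # virtual page number -> physical page number

    def write(self, vpage, data):
        if not 0 <= vpage < self.virtual_pages:
            raise IndexError("outside the defined volume capacity")
        if vpage not in self.page_table:
            # Capacity is allocated only when data is actually stored.
            self.page_table[vpage] = self.pool.allocate()
        self.pool.data[self.page_table[vpage]] = data

    def read(self, vpage):
        ppage = self.page_table.get(vpage)
        # Assumption: an unallocated page reads back as all 0's.
        return bytes(PAGE_SIZE) if ppage is None else self.pool.data[ppage]
```

Here a volume defined with 100 virtual pages over a pool of 4 physical pages consumes a physical page only for each virtual page that has actually been written.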
In the capacity virtualization technology, a physical storage area, which is allocated when data has been written, is called a page. In keeping with this convention, in Patent Literature 1, the physical storage area allocated when data has been written is called a “page” in the higher-level capacity virtualization technology realized by the storage controller. However, the physical storage area allocated when data has been written in the lower-level capacity virtualization technology realized by the flash package is called a “block”, which is the delete unit of the flash memory. In general, page sizes vary widely, but in Patent Literature 1, the size of the page is larger than the size of the block, which is the flash memory delete unit. In a flash memory, whereas the delete unit is generally called a “block” as mentioned above, the read/write unit inside the block is called a “page”. Naturally, in a flash memory, the size of the block is larger than the size of this page. However, in Patent Literature 1, the word page does not refer to the flash memory read/write unit, but rather to a page in higher-level capacity virtualization. In the present invention as well, it is supposed that the word page refers not to the flash memory read/write unit, but rather to a page in higher-level capacity virtualization. However, it is not necessarily a requirement that a storage system targeted by the present invention have the capacity virtualization technology mentioned hereinabove. In addition, formatting is ordinarily performed by writing a specific pattern, for example, all 0's, prior to storing user data in a storage device. Patent Literature 2 discloses a technology by which this specific pattern, which is written by the host, is detected by the storage system at this time, and an already allocated page is released.
Patent Literature 1 also discloses a technology which, in the same case, notifies the flash memory storage device when all 0's are detected by the storage system, and exercises control in the flash memory storage device so that a physical area is not allocated to this area.
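The detection of a format pattern and the release of an already allocated page can be sketched as below. This is only an illustration under stated assumptions: the class name ZeroReclaimingVolume, the page size, and the in-memory tables are hypothetical, and the sketch does not reproduce the specific mechanisms of Patent Literature 1 or Patent Literature 2.

```python
PAGE_SIZE = 8  # hypothetical page size in bytes


class ZeroReclaimingVolume:
    """Toy volume (hypothetical) that releases a page when all 0's are written to it."""

    def __init__(self, physical_pages):
        self.free_pages = list(range(physical_pages))
        self.page_table = {}  # virtual page number -> physical page number
        self.store = {}       # physical page number -> bytes

    def write(self, vpage, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("this sketch only handles whole-page writes")
        if data == bytes(PAGE_SIZE):
            # Format pattern detected: release the page instead of storing it.
            ppage = self.page_table.pop(vpage, None)
            if ppage is not None:
                del self.store[ppage]
                self.free_pages.append(ppage)
            return
        if vpage not in self.page_table:
            if not self.free_pages:
                raise RuntimeError("physical capacity exhausted")
            self.page_table[vpage] = self.free_pages.pop()
        self.store[self.page_table[vpage]] = data

    def read(self, vpage):
        ppage = self.page_table.get(vpage)
        # An unallocated page is presented to the host as all 0's.
        return bytes(PAGE_SIZE) if ppage is None else self.store[ppage]
```

From the standpoint of the host, the page still reads back as all 0's after the release, so the formatting result is unchanged while the physical area returns to the free pool.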
One typical storage function possessed by the storage system is a copy function. A typical example is a function for holding diversified volumes inside the storage system by treating a main volume as a base. The characteristic feature of such volumes is that most of the data is the same as that of the main volume, and only a portion of the data differs. In order to reduce the capacity of the stored data, it is preferable that only the data that differs from the data inside the main volume be stored in the storage system, without the data that is the same as the data inside the main volume (that is, the data that has been duplicated) being stored. Recently, virtual servers have come into widespread use as a server technology. A large number (for example, between 10 and 100) of virtual servers may be allocated to one physical server. In a case like this, a volume is prepared for each virtual server, and data called a “golden image” (a module comprising a virtual server OS and so forth) may be copied to the volume of each virtual server. The golden image copy stored in the volume is used by the virtual server corresponding to this volume. Most of the data inside the respective virtual server volumes is the same as that of the golden image, and only a portion of the data differs. In order to reduce the capacity of the stored data, it is preferable that only the data that differs from the golden image be stored in the storage system, without storing the data that is the same as the golden image (the duplicated data).
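The idea of storing only the data that differs from the golden image can be sketched as follows. The class name CloneVolume and the representation of a volume as a list of fixed-size data units are assumptions made purely for illustration.

```python
class CloneVolume:
    """Per-virtual-server volume (hypothetical) that stores only differences from a golden image."""

    def __init__(self, golden):
        self.golden = golden  # shared list of data units, treated as read-only
        self.diff = {}        # unit index -> data that differs from the golden image

    def write(self, idx, data):
        if data == self.golden[idx]:
            # Same as the golden image: duplicated data is not stored.
            self.diff.pop(idx, None)
        else:
            self.diff[idx] = data

    def read(self, idx):
        # Fall back to the golden image for any unit that has not diverged.
        return self.diff.get(idx, self.golden[idx])
```

With this arrangement, the stored capacity for each virtual server volume grows only with the number of data units that actually differ from the golden image, rather than with the full size of the image.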
In addition, a backup technology for acquiring a backup volume, which becomes the backup for the main volume inside the storage system, is known. This can be broadly divided into two methods.
A first method is a method for using the volume currently being used by the server (specifically, an application on the server) as a basis for forming images of certain points in time (before images) as volumes over a plurality of generations (for example, every day or every month). A second method is a method for using a backup volume of a certain point in time in the past, for example, one month ago, as a basis for acquiring a backup image (after image) each day. According to the first method, pre-update data of a location for which data inside the volume being directly used by the server is to be updated, is acquired, and this pre-update data is stored in a storage area other than this volume. This reduces the amount of stored data. According to the second method, post-update data of a location for which data inside the volume being directly used by the server has been updated, is acquired, and this post-update data is stored in a storage area other than this volume. This likewise reduces the amount of stored data.
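The two methods above can be contrasted with a minimal sketch. The class names BeforeImageSnapshot and AfterImageBackup, and the modeling of a volume as a list of data units, are hypothetical and serve only to show which data each method saves.

```python
class BeforeImageSnapshot:
    """First method (sketch): save pre-update data outside the live volume."""

    def __init__(self, volume):
        self.volume = volume  # the volume directly used by the server
        self.saved = {}       # unit index -> pre-update data (before image)

    def write(self, idx, data):
        if idx not in self.saved:
            # Keep only the data as it was at the time of the snapshot.
            self.saved[idx] = self.volume[idx]
        self.volume[idx] = data

    def read_snapshot(self, idx):
        # Units never updated are read from the live volume itself.
        return self.saved.get(idx, self.volume[idx])


class AfterImageBackup:
    """Second method (sketch): an old full backup plus post-update data."""

    def __init__(self, base_backup):
        self.base = list(base_backup)  # backup volume of a past point in time
        self.after = {}                # unit index -> post-update data (after image)

    def record_update(self, idx, data):
        self.after[idx] = data

    def read_backup(self, idx):
        # Units never updated are read from the old backup volume.
        return self.after.get(idx, self.base[idx])
```

In both cases only the updated locations occupy additional storage: the first method reconstructs a past image from the current volume, and the second reconstructs a recent image from a past backup volume.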