Secondary storage devices in computers, such as hard disks, generally have greater capacity but lower performance than primary storage devices such as memory. Therefore, caching is widely used: a portion of the data in the secondary storage device is replicated and placed in the primary storage device in order to increase the speed of access to data stored in the secondary storage device. Further, some computers have large-capacity secondary storage devices and primary storage devices and provide a data storage function by which the computer itself can be used as a secondary storage device by other computers. Such computers are sometimes called External Controller-Based (ECB) Disk Storages.
When a host computer uses an ECB, the host computer regards the ECB as a secondary storage device, so the host computer uses its own primary storage device as a cache for the ECB, while the ECB uses its own primary storage device as a cache for its secondary storage device. As a result, identical data from the secondary storage device of the ECB is cached in duplicate, in both the host computer and the ECB, which lowers the efficiency of use of the primary storage devices of both computers.
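The duplication described above can be illustrated with a minimal sketch. It assumes simple read-through caches with no eviction and a 4096-byte block size; the class names `BlockStore` and `CachedReader` are hypothetical and are not taken from any actual ECB product.

```python
class BlockStore:
    """Hypothetical secondary storage device: block number -> block contents."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)

    def read(self, block_no):
        return self.blocks[block_no]


class CachedReader:
    """Read-through cache kept in the primary storage of one computer."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}  # block number -> cached copy of the block

    def read(self, block_no):
        # On a miss, fetch from the next tier and keep a copy locally.
        if block_no not in self.cache:
            self.cache[block_no] = self.backend.read(block_no)
        return self.cache[block_no]


# Secondary storage of the ECB: eight blocks of 4096 bytes each (assumed size).
disk = BlockStore({n: bytes([n]) * 4096 for n in range(8)})
ecb = CachedReader(disk)   # the ECB caches its own secondary storage
host = CachedReader(ecb)   # the host caches the ECB, which it sees as a disk

for n in (0, 1, 2):
    host.read(n)

# Blocks that are now held in both primary storage devices.
duplicated = sorted(set(host.cache) & set(ecb.cache))
wasted_bytes = sum(len(host.cache[n]) for n in duplicated)
print(duplicated, wasted_bytes)
```

In this sketch every block read by the host ends up in both caches, so the memory spent on the second copy yields no additional cache coverage.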
A similar situation occurs with virtual machine techniques. A virtual machine technique emulates, by software on a single computer, the various hardware constituting a computer, so that the single computer behaves as if multiple computers existed. Here, the computer provides a virtual storage device to the virtual machine. When the virtual machine stores data in a storage medium of the computer via the emulated storage device that the computer provides to it, the virtual machine uses its own primary storage device as a cache. The computer then transfers the input/output request that the virtual machine issued to the emulated storage device, together with the data to be stored, to its own secondary storage device, and in doing so the computer itself also uses its internal primary storage device as a cache. In the virtual machine technique, the primary storage device and the secondary storage device are shared among the computer and the virtual machines by dividing the capacity of the storage devices. However, if both the computer and the virtual machines use their own primary storage devices as caches, multiple caches of the secondary storage device of the computer will exist. When multiple virtual machines are operated, the same cached data of the secondary storage device may be retained in the primary storage device as many times as the total number of the virtual machines and the computer, which lowers the efficiency of use of the primary storage device. Such a state may occur when multiple virtual machines, such as a virtual machine performing data search and a virtual machine performing data analysis, perform input/output of the same data.
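The multiplication of cached copies can be made concrete with a small sketch. It assumes that every block read by a virtual machine is cached once by that virtual machine and once by the computer, with no eviction; the function name `count_cached_copies` and the virtual machine names are hypothetical.

```python
def count_cached_copies(vm_reads):
    """Count how many copies of each block sit in primary storage.

    vm_reads maps a virtual machine name to the block numbers it has read.
    """
    host_cache = set()
    vm_caches = {}
    for vm, blocks in vm_reads.items():
        vm_caches[vm] = set(blocks)   # cache inside the virtual machine
        host_cache |= vm_caches[vm]   # every guest read passes through the computer

    copies = {}
    for block in host_cache:
        # One copy in the computer plus one per virtual machine that read it.
        copies[block] = 1 + sum(block in cache for cache in vm_caches.values())
    return copies


copies = count_cached_copies({
    "search_vm": [10, 11, 12],
    "analysis_vm": [11, 12, 13],
})
print(sorted(copies.items()))
```

Under these assumptions, a block read by both virtual machines is held three times, which matches the upper bound of the number of virtual machines plus the computer.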
Patent Literature 1 teaches an art of deleting duplicated data in a primary storage device within a virtual machine environment. Specifically, Patent Literature 1 teaches a technique of detecting duplicated data stored in the primary storage devices of the computer and the virtual machines by referring to the contents of the data in the primary storage devices, and deleting the duplicated data.
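As an illustration of content-based duplicate detection in general, and not of the specific method of Patent Literature 1, the following sketch hashes each memory page, confirms a match by comparing the bytes, and retains one shared copy per distinct content. The function name `deduplicate_pages`, the 4096-byte page size, and the owner labels are assumptions made for this example.

```python
import hashlib


def deduplicate_pages(pages):
    """Keep one copy per distinct page content.

    pages maps (owner, page_no) to the page contents as bytes.
    Returns the shared store and a mapping from each page to its shared copy.
    """
    store = {}    # digest -> the single retained copy of the contents
    mapping = {}  # (owner, page_no) -> digest of the shared copy
    for key, data in pages.items():
        digest = hashlib.sha256(data).hexdigest()
        kept = store.setdefault(digest, data)
        # Confirm by content, guarding against an (unlikely) hash collision.
        if kept != data:
            raise ValueError("hash collision between different page contents")
        mapping[key] = digest
    return store, mapping


pages = {
    ("host", 0): b"A" * 4096,
    ("vm1", 7): b"A" * 4096,
    ("vm2", 3): b"A" * 4096,
    ("vm2", 4): b"B" * 4096,
}
store, mapping = deduplicate_pages(pages)
bytes_before = sum(len(p) for p in pages.values())
bytes_after = sum(len(p) for p in store.values())
print(bytes_before, bytes_after)
```

Note that such an approach has to examine the contents of every page in order to find duplicates.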