In recent years, demand for high-speed communication has grown, and various attempts have been made to increase communication speed. For instance, in order to reduce the amount of data transmitted, deduplication may be performed on the data to be transmitted (transmission target data). In deduplication, a receiving-side transfer device holds, in association with an identifier, data received in the past from a transmitting-side transfer device. The transmitting-side transfer device also holds, in association with an identifier, data transmitted in the past. The transmitting-side transfer device determines whether part or all of the transmission target data has already been transmitted. For data already transmitted, the identifier associated with the data is transmitted to the receiving-side transfer device instead of the actual data. The receiving-side transfer device reads the data held in association with the received identifier, and handles the read data as the data transmitted from the transmitting-side transfer device.
FIG. 1 is a diagram illustrating an example of deduplication. In the system illustrated in FIG. 1, a terminal 10a and a terminal 10b communicate with each other via a transfer device 5a and a transfer device 5b. The terminal 10a is connected to the transfer device 5a, and the terminal 10b is connected to the transfer device 5b. The transfer device 5a and the transfer device 5b communicate with each other via a wide area network (WAN) 3.
For instance, assume that the terminal 10a transmits data A to the terminal 10b. In this case, the terminal 10a transmits the data A to the transfer device 5a (arrow A1). The transfer device 5a determines whether the transmission data received from the terminal 10a is data transmitted in the past. When the data transmitted from the terminal 10a is relatively large, the transfer device 5a divides the received data into multiple pieces as appropriate, and identifies both the pieces transmitted in the past and the pieces that appear multiple times within the transmission data. In the example of FIG. 1, communications using data A, data B, and data C have already been performed between the transfer device 5a and the transfer device 5b. Thus, each of the transfer device 5a and the transfer device 5b holds the data A, the data B, and the data C in a cache 6 (6a, 6b), each in association with an identifier that identifies the data. For instance, it is assumed that each of the transfer device 5a and the transfer device 5b holds the data A in association with an identifier “a”. Similarly, it is assumed that the data B is associated with an identifier “b” and the data C is associated with an identifier “c”.
In the example of FIG. 1, the transmission data is the data A. Thus, instead of the data A, the transfer device 5a transmits the identifier “a” associated with the data A to the terminal 10b (arrow A2). The identifier “a” therefore arrives at the transfer device 5b as data addressed to the terminal 10b (arrow A3). The transfer device 5b then refers to the cache 6b, obtains the data A associated with the identifier “a”, and transmits the data A to the terminal 10b as the data transmitted from the terminal 10a (arrow A4).
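The exchange along arrows A1 to A4 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of any transfer device described here: the class name `TransferDevice`, the methods `send` and `receive`, the message tuples, and the use of a SHA-256 digest as the identifier are all hypothetical choices (the description above only requires some identifier, such as “a”, held in association with the data).

```python
import hashlib


class TransferDevice:
    """Hypothetical transfer device holding a cache of identifier -> data."""

    def __init__(self):
        self.cache = {}

    @staticmethod
    def make_id(data):
        # Assumption: the identifier is a SHA-256 digest of the data.
        return hashlib.sha256(data).hexdigest()

    def send(self, data):
        """Return the message to put on the WAN for the given data."""
        ident = self.make_id(data)
        if ident in self.cache:
            # Already transmitted in the past: send only the identifier.
            return ("id", ident)
        # First transmission: remember the data and send it in full.
        self.cache[ident] = data
        return ("data", data)

    def receive(self, message):
        """Return the data to deliver to the destination terminal."""
        kind, payload = message
        if kind == "id":
            # Read the data held in association with the identifier.
            return self.cache[payload]
        # Actual data: hold it for future deduplication, then deliver it.
        self.cache[self.make_id(payload)] = payload
        return payload


dev_5a, dev_5b = TransferDevice(), TransferDevice()

# Past communication: data A crosses the WAN once in full.
first = dev_5a.send(b"data A")
dev_5b.receive(first)

# Later transmission of the same data: only the identifier crosses the WAN.
second = dev_5a.send(b"data A")
delivered = dev_5b.receive(second)
```

In this sketch both devices end up holding the same entry for data A, which mirrors the state of the caches 6a and 6b in FIG. 1.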
As related art, a relay device is known which extracts, from data received from a first device, a duplicated pattern that overlaps with data received from the first device in the past, replaces the duplicated pattern with an identifier associated with the duplicated pattern, and transfers the identifier to a second device. A server has also been proposed which, when transmitting data, transmits difference data relative to cache data together with an identifier of the cache data.
Related techniques are disclosed in, for example, Japanese Laid-open Patent Publication No. 2015-82296 and Japanese Laid-open Patent Publication No. 2007-299019.
When deduplication is performed in bidirectional communication, data once transmitted from either the transmitting-side transfer device or the receiving-side transfer device is held in both the transmitting-side transfer device and the receiving-side transfer device. Therefore, across the overall network, communication using deduplication consumes cache capacity equivalent to the product of the volume of data subject to deduplication and the number of transfer devices, and thus the utilization efficiency of the caches provided in the devices included in the network is low.
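The capacity relationship stated above can be written out as a short worked expression. The symbols are introduced here only for illustration and are not taken from the description: $V$ denotes the volume of data subject to deduplication, $N$ the number of transfer devices, and $C_{\text{total}}$ the cache capacity consumed across the network.

```latex
C_{\text{total}} = V \times N
% Example with the two transfer devices 5a and 5b of FIG. 1 (N = 2),
% each holding the data A, B, and C:
C_{\text{total}} = 2V
```

In other words, under these assumptions, only a fraction $1/N$ of the cache capacity in use holds distinct data.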