A distributed storage system that arranges data and a parity in different nodes has been known. Since the Erasure Code in such a distributed storage system arranges data and a parity in different nodes, updating data involves a distributed transaction. In one of the known techniques for such a storage system, each node records post-updating image data in the form of a journal in preparation for possible node failure, so that the reliability of the transaction can be improved.
FIG. 7 of the accompanying drawings illustrates a process of writing a journal in a traditional storage system, and FIG. 8 illustrates a process of applying the journal in the same system.
A storage system (distributed storage system) 100a of FIGS. 7 and 8 provides a memory region to a non-illustrated host device. The storage system 100a includes multiple (three in the illustrated example) data nodes 1a (data nodes #1-#3) and multiple (two in the illustrated example) parity nodes 2a (parity nodes #1 and #2). This means that FIGS. 7 and 8 illustrate an example in which the stripe of the Erasure Code consists of three data chunks and two parity chunks. Each data node 1a is communicably connected to each parity node 2a via, for example, a Local Area Network (LAN) cable.
Hereinafter, each individual data node is specified by “data node #1”, “data node #2”, or “data node #3”, but an arbitrary data node is represented by “data node 1a”. Likewise, each individual parity node is specified by “parity node #1” or “parity node #2”, but an arbitrary parity node is represented by “parity node 2a”. 
A data node 1a stores therein data received from an external device such as the non-illustrated host device, and includes a Central Processing Unit (CPU) 11a, a file store 13a, a journal disk 14a, and a non-illustrated memory. Since the data nodes #1-#3 have the same functional configuration, FIGS. 7 and 8 omit illustration of the functional configurations of the data nodes #1 and #3.
The CPU 11a is a processor that executes various controls and calculations, and achieves various functions by executing an Operating System (OS) and a program stored in the non-illustrated memory.
The file store 13a is a known device that readably and writably stores therein data received by the data node 1a, and is exemplified by a Hard Disk Drive (HDD) or a Solid State Drive (SSD).
The journal disk 14a is a known device that readably and writably stores therein a journal, which is a record of data received by the data node 1a, and is exemplified by an HDD or an SSD.
A parity node 2a is a node that stores therein a parity of data stored in the data node 1a, and includes a CPU 21a, a file store 23a, a journal disk 24a, and a non-illustrated memory. Since the parity nodes #1 and #2 have the same functional configuration, FIGS. 7 and 8 omit illustration of the functional configuration of the parity node #1.
The CPU 21a is a processor that executes various controls and calculations, and achieves various functions by executing an Operating System (OS) and a program stored in the non-illustrated memory.
The file store 23a is a known device that readably and writably stores therein a parity of data received by the data node 1a, and is exemplified by an HDD or an SSD.
The journal disk 24a is a known device that readably and writably stores therein a journal of a parity to be stored in the file store 23a, and is exemplified by an HDD or an SSD.
A process of writing a journal and a process of applying the journal in a traditional storage system will now be described with reference to FIGS. 7 and 8. To simplify the description, a process by “the CPU 11a of the data node #2” is referred to as “a process by the data node #2”, and likewise a process by “the CPU 21a of the parity node #2” is referred to as “a process by the parity node #2”.
The data node #2 receives updating data “7” from a non-illustrated host device (see symbol B1 in FIG. 7).
The data node #2 reads pre-updating data “4” from the file store 13a (see symbol B2 in FIG. 7).
The data node #2 calculates the difference between the updating data and the pre-updating data (see symbol B3 in FIG. 7). In the example of FIG. 7, the data node #2 calculates the value “3” to be the difference data by subtracting the pre-updating data “4” from the updating data “7”.
The data node #2 writes updating data (post-updating data) “7” into the journal disk 14a (see symbol B4 in FIG. 7).
The data node #2 forwards the calculated difference data “3” to the parity node #2 (see symbol B5 in FIG. 7).
The parity node #2 receives the difference data “3” from the data node #2 (see symbol B6 in FIG. 7).
The parity node #2 reads a pre-updating parity from the file store 23a (see symbol B7 in FIG. 7).
The parity node #2 applies the difference to the pre-updating parity (see symbol B8 in FIG. 7). In the example of FIG. 7, the parity node #2 calculates a post-updating parity “5” by adding the difference data “3” to the pre-updating parity “2”.
The parity node #2 writes the calculated post-updating parity “5” into the journal disk 24a (see symbol B9 in FIG. 7).
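The journal-writing sequence of symbols B1 to B9 can be sketched in Python as follows. This is only an illustrative sketch: the `Node` class and the `write_journal` function are hypothetical names that do not appear in the figures, and the parity is computed with the simplified integer addition and subtraction used in the example of FIG. 7 rather than with the arithmetic of an actual Erasure Code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a data node 1a or a parity node 2a."""
    file_store: int                               # value held in the file store
    journal: list = field(default_factory=list)   # contents of the journal disk


def write_journal(data_node, parity_node, updating_data):
    """Follow symbols B1 to B9 of FIG. 7 using the figure's simplified arithmetic."""
    pre_data = data_node.file_store             # B2: read pre-updating data
    difference = updating_data - pre_data       # B3: calculate difference data
    data_node.journal.append(updating_data)     # B4: journal the post-updating data
    # B5/B6: the difference data is forwarded to, and received by, the parity node
    pre_parity = parity_node.file_store         # B7: read pre-updating parity
    post_parity = pre_parity + difference       # B8: apply the difference
    parity_node.journal.append(post_parity)     # B9: journal the post-updating parity
    return difference, post_parity


# Values taken from the example of FIG. 7.
data_node_2 = Node(file_store=4)
parity_node_2 = Node(file_store=2)
difference, post_parity = write_journal(data_node_2, parity_node_2, 7)
print(difference, post_parity)  # prints "3 5"
```

In this sketch, as in FIG. 7, the file stores are left unchanged at this stage; only the journals receive the post-updating values.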
Next, the data node #2 reads a journal from the journal disk 14a for applying the journal (see symbol B10 in FIG. 8).
Then, the data node #2 writes the read journal into the file store 13a (see symbol B11 in FIG. 8).
The parity node #2 reads a journal from the journal disk 24a for applying the journal (see symbol B12 in FIG. 8).
The parity node #2 writes the read journal into the file store 23a (see symbol B13 in FIG. 8).
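The journal-applying sequence of symbols B10 to B13 can likewise be sketched in Python. The function name `apply_journal` is hypothetical, and the sketch assumes that the journal holds whole post-updating values that simply replace the content of the file store; what happens to the journal entries after they are applied is not described above and is therefore not modeled.

```python
def apply_journal(file_store, journal):
    """Return the file-store content after the journal entries are applied in order."""
    for entry in journal:      # B10 / B12: read a journal entry from the journal disk
        file_store = entry     # B11 / B13: write the entry into the file store
    return file_store


# Values taken from the example of FIGS. 7 and 8.
data_file_store = apply_journal(4, [7])     # data node #2
parity_file_store = apply_journal(2, [5])   # parity node #2
print(data_file_store, parity_file_store)   # prints "7 5"
```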
[Patent Literature 1] Japanese Laid-open Patent Publication No. 08-87424
[Patent Literature 2] Japanese Laid-open Patent Publication No. 2006-338461
In the traditional storage system 100a, the data node 1a and the parity node 2a include dedicated journal disks 14a and 24a, respectively. With this configuration, the data node 1a writes the post-updating data, as a journal, into the journal disk 14a while the parity node 2a writes a post-updating parity, as a journal, into the journal disk 24a. Accordingly, as the number of parity nodes 2a increases, the data volume of the journals increases, which requires more disk capacity. Furthermore, the storage system 100a incurs increased Input/Output (I/O) overhead.
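The growth of the journal volume can be illustrated with a rough Python estimate. The sketch rests on assumptions that are not stated above: every parity node 2a is assumed to journal one post-updating parity of the same size as the updated data chunk, and the function name `journal_bytes_per_update` and the 4096-byte chunk size are hypothetical.

```python
def journal_bytes_per_update(chunk_bytes, parity_nodes):
    """Estimate the journal volume written for one updated data chunk.

    Assumes one journal entry on the data node plus one entry of the
    same size on each parity node.
    """
    return chunk_bytes * (1 + parity_nodes)


# Two parity nodes, as in FIGS. 7 and 8, versus three parity nodes.
two_parity = journal_bytes_per_update(4096, 2)
three_parity = journal_bytes_per_update(4096, 3)
print(two_parity, three_parity)  # prints "12288 16384"
```

Under these assumptions the journal volume grows linearly with the number of parity nodes, and each journal entry is later read back and written into a file store, which adds further I/O.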