In the data of a database system or a file of an information processing apparatus (computer), an encoded column may be replaced with a name, or a plurality of records of different types may be coupled to generate a single record. An example of replacing an encoded column with a name is a case where a prefecture number is replaced with a prefecture name. An example of a file of an information processing apparatus is a file in the Extensible Markup Language (XML) format or the comma-separated values (CSV) format.
A method for generating a table by joining two unsorted tables may be an equi join or a non-equi join. In the equi join, records are coupled when the character strings of the items (fields) serving as the keys of the records in the two tables match each other. In the non-equi join, on the other hand, records are coupled using not only matching key character strings but also other join conditions. Examples of other join conditions include a case where a character string in one table matches a part of a character string in the other table, and a case where a numeric value in one table is included in a numeric range in the other table.
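The difference between the equi join and the non-equi join can be illustrated with a minimal sketch in which only the join condition changes. The function name `nested_loop_join`, the field names, and the sample records below are hypothetical and are introduced only for illustration; a simple nested loop is used here for clarity and is not a method described in this document.

```python
def nested_loop_join(left, right, condition):
    """Couple every pair of records for which condition(l, r) holds."""
    return [(l, r) for l in left for r in right if condition(l, r)]


# Hypothetical sample tables (not taken from the figures).
journal = [
    {"key": "tokyo", "amount": 120},
    {"key": "osaka-north", "amount": 45},
    {"key": "kyoto", "amount": 300},
]
master = [
    {"key": "tokyo", "low": 100, "high": 199},
    {"key": "osaka", "low": 0, "high": 99},
]

# Equi join: the key character strings must match exactly.
equi = nested_loop_join(journal, master, lambda l, r: l["key"] == r["key"])

# Non-equi join: the master key matches a part of the journal key.
partial = nested_loop_join(journal, master, lambda l, r: r["key"] in l["key"])

# Non-equi join: a numeric value falls within a numeric range of the other table.
in_range = nested_loop_join(
    journal, master, lambda l, r: r["low"] <= l["amount"] <= r["high"]
)
```

In this sketch the equi join couples only the "tokyo" records, whereas the partial-match condition additionally couples "osaka-north" with "osaka".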
Furthermore, the type of joining method based on an output record may be inner join, left outer join, full outer join, etc.
FIG. 1 illustrates examples of the inner join, the left outer join, and the full outer join. When a table 103 is generated by joining a journal table 101 and a master table 102 in the inner join, only records 131 through 133 obtained by coupling the record of the journal table 101 with the record of the master table 102 are output as the records of the table 103. The record 131 is obtained by coupling a record 111 of the journal table 101 with a record 123 of the master table 102, and a record 132 is obtained by coupling a record 115 of the journal table 101 with the record 123 of the master table 102. A record 133 is obtained by coupling a record 114 of the journal table 101 with a record 124 of the master table 102.
On the other hand, in addition to the records 131 through 133, uncoupled records 112 and 113 of the journal table 101 are also output as records 134 and 135 of the table 103 in the left outer join. Furthermore, in the full outer join, uncoupled records 122 and 121 of the master table 102 are also output as records 136 and 137 of the table 103 in addition to the records 131 through 135.
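The three output types can be sketched as follows, assuming an equi join on a single key value. The function `join`, its `how` parameter, and the key values are hypothetical; the values are chosen only so that the record counts agree with the description of FIG. 1 (three coupled records, two uncoupled journal records, and two uncoupled master records).

```python
def join(left, right, how="inner"):
    """Equi join of two lists of key values; None marks a missing side."""
    out = []
    matched_right = set()
    for l in left:
        hits = [j for j, r in enumerate(right) if r == l]
        for j in hits:
            out.append((l, right[j]))
            matched_right.add(j)
        # Left outer and full outer joins also keep uncoupled left records.
        if not hits and how in ("left", "full"):
            out.append((l, None))
    # The full outer join also keeps uncoupled right records.
    if how == "full":
        out.extend((None, r) for j, r in enumerate(right) if j not in matched_right)
    return out


journal = [3, 1, 6, 5, 3]  # hypothetical keys standing in for records 111 to 115
master = [2, 4, 3, 5]      # hypothetical keys standing in for records 121 to 124

inner = join(journal, master, "inner")
left_outer = join(journal, master, "left")
full_outer = join(journal, master, "full")
```

Under these assumptions the inner join yields three records, the left outer join five, and the full outer join seven, corresponding to the records 131 through 133, 131 through 135, and 131 through 137, respectively.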
Furthermore, merge join is well known as a method of performing the full outer join that is capable of realizing both the equi join and the non-equi join. In the merge join, the records to be joined in two tables are first sorted and then coupled.
FIGS. 2 through 8 illustrate examples of the merge join. First, as illustrated in FIG. 2, the records 111 through 115 of the journal table 101 and the records 121 through 124 of the master table 102 are sorted in the ascending order of the values of the records, thereby generating tables 201 and 202.
Next, as illustrated in FIG. 3, the leading records 211 and 221 in the tables 201 and 202 are compared with each other. In this case, since the values of the records 211 and 221 do not match each other, the record 211 having a smaller value “1” is output as a record 231 of a table 203.
Next, as illustrated in FIG. 4, the next record 212 in the table 201 is compared with the record 221 of the table 202. In this case, since the values of the records 212 and 221 do not match each other, the record 221 having a smaller value “2” is output as a record 232 of the table 203.
Next, as illustrated in FIG. 5, the record 212 of the table 201 is compared with the next record 222 of the table 202. In this case, since the values of the records 212 and 222 match each other, the records 212 and 222 are coupled, and output as a record 233 of the table 203.
Next, as illustrated in FIG. 6, between a record 213 next to the record 212 and a record 223 next to the record 222, the record 213 having a smaller value “3” is selected as a record to be compared, and the record 213 is compared with the record 222. In this case, since the values of the records 213 and 222 match each other, the records 213 and 222 are coupled with each other, and output as a record 234 of the table 203.
Next, as illustrated in FIG. 7, between a record 214 next to the record 213 and the record 223 next to the record 222, the record 223 having a smaller value “4” is selected as a record to be compared, and the record 213 is compared with the record 223. In this case, since the values of the records 213 and 223 do not match each other, and the record 213 having a smaller value “3” has already been output, no record is output.
By repeating the above-mentioned comparison and output of the records until the trailing records 215 and 224 of the tables 201 and 202 are reached, all records of the tables 201 and 202 are output as illustrated in FIG. 8. The record 223 of the table 202 is output as a record 235 of the table 203, and the record 215 of the table 201 is output as a record 237 of the table 203. In addition, the record 214 of the table 201 is coupled with the record 224 of the table 202, and output as a record 236 of the table 203.
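The merge join described with reference to FIGS. 2 through 8 can be sketched as a conventional sort-merge full outer join. The following is a minimal sketch and not the exact record-by-record procedure of the figures: records sharing a key are handled as a group, which yields the same output order. The function name `merge_join_full_outer` is hypothetical, and the key values "5" and "6" are assumptions, since only the values "1" through "4" are stated in the description.

```python
def merge_join_full_outer(left, right):
    """Sort both inputs, then couple records by advancing two cursors."""
    a, b = sorted(left), sorted(right)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append((a[i], None))  # uncoupled record of the left table
            i += 1
        elif a[i] > b[j]:
            out.append((None, b[j]))  # uncoupled record of the right table
            j += 1
        else:
            # Find the run of equal keys on each side and couple every pair.
            key = a[i]
            i_end = i
            while i_end < len(a) and a[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(b) and b[j_end] == key:
                j_end += 1
            out.extend((x, y) for x in a[i:i_end] for y in b[j:j_end])
            i, j = i_end, j_end
    # Whatever remains on either side has no partner.
    out.extend((x, None) for x in a[i:])
    out.extend((None, y) for y in b[j:])
    return out


journal = [3, 1, 6, 5, 3]  # keys "5" and "6" are assumed values
master = [2, 4, 3, 5]
table_203 = merge_join_full_outer(journal, master)
```

With these assumed values, the seven output tuples appear in the same order as the records 231 through 237 of the table 203: an uncoupled "1", an uncoupled "2", two coupled records for "3", an uncoupled "4", one coupled record, and a trailing uncoupled record.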
Also known are a data sorting method using an automaton to which a character string serving as a key of a record is input, and a data aggregating method using statistical data of a trie structure.
Patent Document 1: Japanese Laid-open Patent Publication No. 2003-44267
Patent Document 2: Japanese Laid-open Patent Publication No. 2006-171800
Patent Document 3: Japanese Laid-open Patent Publication No. 2010-108093