A checking process or a similarity calculation that uses unstructured data, such as an image, audio, sensor data, or the like, sometimes takes a long time. Accordingly, there is a conventional technology that improves the efficiency of the checking process by allocating record data to a plurality of computational resources and distributing the process among them.
FIG. 31 is a schematic diagram illustrating an example of a conventional technology. For example, when record data is checked by using a certain query, there may be a case in which the processing time does not depend on the query but depends only on the record data. For example, when the length of a certain frequency component present in a music file is counted in units of seconds, the processing time depends on the length of the music. In such a case, a mixed integer programming problem is solved, and the pieces of record data are then distributed to the computational resources such that the amount of processing allocated to each computational resource is almost equal.
In the example illustrated in FIG. 31, it is assumed that record data 10a to 10j are present and that the length of each piece of record data represents the processing time needed to process that piece of record data. For example, the record data 10a, 10b, and 10j are distributed to a first server, the record data 10c, 10e, 10d, and 10g are distributed to a second server, and the record data 10i, 10f, and 10h are distributed to a third server. By distributing the record data 10a to 10j in this way, the processing times of the servers can be equalized.
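The load-equalizing distribution described above can be illustrated with a minimal sketch. The conventional technology is described as solving a mixed integer programming problem; the sketch below substitutes a simpler greedy heuristic (longest processing time first) purely for illustration. The function name `balance_by_length`, the number of servers, and the record lengths are hypothetical and are not taken from FIG. 31.

```python
import heapq


def balance_by_length(lengths, num_servers):
    """Assign records to servers with a greedy longest-first heuristic.

    This is a stand-in for the mixed integer programming formulation
    mentioned in the text, not a reproduction of it.
    """
    # Min-heap of (current total load, server index).
    heap = [(0, s) for s in range(num_servers)]
    heapq.heapify(heap)
    assignment = {s: [] for s in range(num_servers)}

    # Visit records from longest to shortest; ties are broken by name
    # so that the result is deterministic.
    for name, length in sorted(lengths.items(), key=lambda kv: (-kv[1], kv[0])):
        load, server = heapq.heappop(heap)  # least-loaded server so far
        assignment[server].append(name)
        heapq.heappush(heap, (load + length, server))

    loads = {s: sum(lengths[n] for n in names) for s, names in assignment.items()}
    return assignment, loads


# Hypothetical processing times (arbitrary units) for record data 10a to 10j.
lengths = {
    "10a": 5, "10b": 4, "10c": 3, "10d": 3, "10e": 2,
    "10f": 4, "10g": 2, "10h": 3, "10i": 3, "10j": 1,
}
assignment, loads = balance_by_length(lengths, num_servers=3)
print(assignment)
print(loads)
```

A greedy heuristic of this kind does not guarantee an optimal split in general; it only shows how record-dependent processing times can be spread so that each server receives a similar total.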
Patent Document 1: Japanese Laid-open Patent Publication No. 2003-223344
Patent Document 2: Japanese National Publication of International Patent Application No. 2002-513975
Patent Document 3: Japanese Laid-open Patent Publication No. 2008-21295
Patent Document 4: International Publication Pamphlet No. WO 2013/136528
However, the conventional technology described above has a problem in that, when data is placed in a plurality of computational resources, it is not possible to prevent similar pieces of data from concentrating in the same computational resource while substantially equalizing the amount of data allocated to each of the computational resources.
For example, there may be a case in which the processing time does not depend only on the record data but varies depending on the pair of data, i.e., the query data and the record data. Furthermore, the processing time may become longer as the pair of data becomes more similar. In such a case, even if the pieces of record data are distributed to the computational resources so that the amount of record data is equalized, the processing concentrates in the computational resource that holds the record data similar to the query, and it is therefore difficult for the conventional technology to perform the process efficiently.
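The following minimal sketch illustrates this problem under stated assumptions. The cost model (processing time equal to the record length multiplied by one plus a similarity score), the one-dimensional feature values, and all names such as `make_pair_cost` are hypothetical choices made only for illustration; the text does not specify how processing time depends on similarity.

```python
def server_loads(allocation, cost):
    """Total processing time per server under a given per-record cost function."""
    return {server: sum(cost(name) for name in names)
            for server, names in allocation.items()}


# Hypothetical records: name -> (length, one-dimensional feature value).
records = {
    "r1": (4, 0.9),
    "r2": (4, 0.8),
    "r3": (4, 0.1),
    "r4": (4, 0.2),
}

# An allocation that is balanced when only record length is considered,
# but that places the two mutually similar records on the same server.
allocation = {"server1": ["r1", "r2"], "server2": ["r3", "r4"]}


def similarity(query_feature, record_feature):
    """Assumed similarity in [0, 1]: closer feature values are more similar."""
    return 1.0 - abs(query_feature - record_feature)


def record_only_cost(name):
    """Cost that depends only on the record (the conventional assumption)."""
    return records[name][0]


def make_pair_cost(query_feature):
    """Cost that grows with query-record similarity (assumed model)."""
    def cost(name):
        length, feature = records[name]
        return length * (1.0 + similarity(query_feature, feature))
    return cost


balanced = server_loads(allocation, record_only_cost)
skewed = server_loads(allocation, make_pair_cost(query_feature=0.9))
print(balanced)  # equal totals when cost ignores the query
print(skewed)    # server1 carries more work for a query similar to r1 and r2
```

Under this assumed model, the allocation that looks even by record length becomes uneven once the query resembles the records held by one server, which is the situation the paragraph above describes.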