FIG. 19 is a view illustrating an example of a flat file. FIG. 20 is a view illustrating an example of extraction of records from flat files. A flat file is, for example, a file which holds information including a plurality of items (item names) in a Comma Separated Values (CSV) format or an eXtensible Markup Language (XML) format, and holds a plurality of records in the CSV format as illustrated in, for example, FIG. 19. In a record, a plurality of items (columns) is partitioned by delimiters such as commas. In a system which uses flat files, the formats (e.g. columns) of data in the flat files may change from moment to moment, the files can be stored as is without modification, and a Relational DataBase (RDB) schema definition does not need to be determined in advance.
As illustrated in, for example, FIG. 20, records including the items {“name”, “address”, “age”} are held in a file 1, the item “blood type” is added in a file 2, and the item “age”, which has become unnecessary, is deleted in a file 3. Even so, the system which uses flat files can extract records matching search conditions specified by a user (e.g. “mountain” is included in the item “name”) from the plurality of files 1 to 3 having different data formats.
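The extraction across files 1 to 3 can be pictured with a minimal Python sketch. The file contents, the function name `extract`, and the search string "yama" (a stand-in for the "mountain" condition) are all assumptions made for illustration; the sketch only shows that records can be matched by item name even though the columns differ per file.

```python
import csv
import io

# Hypothetical contents of files 1 to 3, following the column changes described above.
FILES = {
    "file1": "name,address,age\nYamada,Tokyo,30\nSuzuki,Osaka,41\n",
    "file2": "name,address,age,blood type\nKoyama,Nagoya,25,A\nSato,Kyoto,52,O\n",
    "file3": "name,address,blood type\nYamamoto,Fukuoka,B\nTanaka,Sapporo,AB\n",
}

def extract(files, item, substring):
    """Return (file name, record) pairs whose `item` value contains `substring`."""
    hits = []
    needle = substring.lower()
    for fname, text in files.items():
        # DictReader maps each record to its own file's header line,
        # so no common schema has to be defined in advance.
        for record in csv.DictReader(io.StringIO(text)):
            value = record.get(item) or ""  # tolerate files lacking the item
            if needle in value.lower():
                hits.append((fname, record))
    return hits

matches = extract(FILES, "name", "yama")
```

Searching on an item that only some files hold, such as "blood type", works the same way: files without that item simply contribute no records.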
Because of the above advantages, systems are used which accumulate a great amount of operation data having various items in a flat file data storage and make use of the data. For example, such a system is used to analyze sales data and inventory data of a Point Of Sale (POS) system and place orders, and to manage journal data of an Automatic Teller Machine (ATM).
Meanwhile, when a great amount of non-typical data is stored, the system which uses flat files is required to search for (extract) operation data at high speed and to make use of the operation data as appropriate. Here, the non-typical data is data whose columns change in various ways, and is, for example, table data or journal data in a Portable Document Format (PDF) document. FIG. 21 is a view for explaining an example of a process of narrowing down the flat files from which records are to be extracted. In the following, a flat file will be simply referred to as a file.
As illustrated in FIG. 21, the system includes a file storage region in which files are stored, and a management region (management database (DB)) in which meta information of the files is held.
Before files are stored, the system determines in advance an item name (e.g. “item 1”) which serves as a key to be used to narrow down files. Upon receipt of a storing request of the file 1 from the user (step S101), the system opens the file 1 and obtains information of the key item “item 1” (e.g. a maximum value of 10 and a minimum value of 2 in the column) (step S102). Further, the system stores the obtained information of the “item 1” as meta information in the management region, and stores the file 1 in the file storage region (step S103).
Next, upon receipt from the user of an extraction request for data (records) satisfying “item 1>13” as search conditions (step S104), the system makes an inquiry to the management DB (step S105) and removes, from the search targets, files whose meta information does not match the search conditions. In the case of FIG. 21, the maximum value of the “item 1” of the file 1 is 10 and the maximum value of the “item 1” of the file 2 is 15. Therefore, the file 1, which cannot satisfy the search conditions of “item 1>13”, is removed from the search targets. The system then opens only the file 2 stored in the file storage region, and performs a process of extracting records matching the search conditions from the file 2 (step S106).
Thus, in the example illustrated in FIG. 21, files which do not need to be searched are removed from the great amount of stored files, and the search target files are narrowed down. Consequently, data search is sped up.
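Steps S101 to S106 can be sketched in Python as follows. This is only an illustration under stated assumptions: in-memory dictionaries stand in for the file storage region and the management DB, the function names and file contents are hypothetical, and the minimum value of file 2 (5 here) is invented because the text gives only its maximum.

```python
import csv
import io

KEY_ITEM = "item 1"  # key item fixed in advance, as in the text

file_storage = {}   # stands in for the file storage region
management_db = {}  # stands in for the management DB: file name -> (min, max)
opened = []         # records which files a search actually opens

def store(name, text):
    """Steps S101 to S103: record min/max of the key item, then store the file."""
    values = [int(r[KEY_ITEM]) for r in csv.DictReader(io.StringIO(text))]
    management_db[name] = (min(values), max(values))
    file_storage[name] = text

def search_greater_than(threshold):
    """Steps S104 to S106: prune by meta information, then scan what remains."""
    results = []
    for name, (_, maximum) in management_db.items():
        if maximum <= threshold:
            continue  # no record in this file can satisfy item 1 > threshold
        opened.append(name)
        for r in csv.DictReader(io.StringIO(file_storage[name])):
            if int(r[KEY_ITEM]) > threshold:
                results.append(r)
    return results

store("file1", "item 1,item 2\n2,a\n10,b\n")   # min 2, max 10 as in FIG. 21
store("file2", "item 1,item 2\n5,c\n15,d\n")   # max 15 as in FIG. 21; min is assumed
hits = search_greater_than(13)
```

With these inputs only file 2 is opened, because the stored maximum of file 1 (10) already rules it out. A condition on any item other than the key item would give the management DB nothing to prune with.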
A relevant technique is, for example, a technique of adding secondary information, such as a keyword for searching for primary information, to the primary information, which is obtained by converting a document into an image signal, storing and registering the result in a memory, and searching for and reading the necessary primary information from the memory using the secondary information (see, e.g., Patent Document 1 below).
Upon registration, this technique sequentially stores the material numbers of primary information to be registered in a material number list file, in which the material numbers of primary information including a keyword are stored per keyword. Upon search, this technique creates, for the keywords in the secondary information given as search conditions, a bit map memory in which bits are allocated using keyword seeds and material numbers as addresses, and reads from the memory only the primary information found based on the bit map memory.
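As a rough illustration of this keyword and bit map approach, the following Python sketch keeps a per-keyword list of material numbers at registration and builds one row of bits per search keyword at search time. It is not taken from the cited publication: the data structures, the function names, and in particular the choice to combine several keywords with a bitwise AND are assumptions made for this example.

```python
from collections import defaultdict

material_lists = defaultdict(list)  # keyword -> material numbers (stands in for the list file)
memory = {}                         # material number -> primary information

def register(number, primary, keywords):
    """Store the primary information and append its number to each keyword's list."""
    memory[number] = primary
    for kw in keywords:
        material_lists[kw].append(number)

def search(keywords):
    """Build one bit row per keyword, AND the rows (assumed), read the matches."""
    rows = []
    for kw in keywords:
        bits = 0
        for number in material_lists.get(kw, []):
            bits |= 1 << number  # bit position = material number
        rows.append(bits)
    if not rows:
        return []
    combined = rows[0]
    for bits in rows[1:]:
        combined &= bits
    # Only primary information whose bit survived is read from memory.
    return [memory[n] for n in sorted(memory) if (combined >> n) & 1]

register(1, "doc A", ["invoice", "2011"])
register(2, "doc B", ["invoice"])
register(3, "doc C", ["2011", "invoice"])
```

Here a search for both "invoice" and "2011" reads only materials 1 and 3, while material 2 is never touched.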
Patent Document 1: Japanese Laid-open Patent Publication No. 01-237723
Patent Document 2: Japanese Laid-open Patent Publication No. 2000-242536
In the above example illustrated in FIG. 21, when the item name which serves as a key is not included in the search conditions, all files stored in the file storage region become search targets. That is, compared to a case where the key item name is included in the search conditions, it is difficult to narrow down the files, so the cost of opening files increases and search response degrades.
Further, the above relevant technique does not take into account deletion of primary information from a management table (the material number list file and the bit map memory). That is, the information of the management table is not configured to be updated in response to, for example, deletion of the primary information. Therefore, when primary information is repeatedly registered and deleted, an unnecessary process of reading non-existent primary information occurs and search response degrades.