1. Field of the Invention
The present invention relates to data processing in which data are classified into two or more data areas associated with one another by an object, such as a customer, employee, patient, criminal, product, airplane, facility or nuclear reactor, so that the data are recorded in and read from the corresponding data areas. In particular, the present invention relates to an improved technique capable of recording and retrieving data independently of the type of object and the amount of data.
2. Description of the Related Art
When various data related to customers are collected and managed in a storage device, these data are generally divided into plural data areas, for example, tables or files (hereinafter, the data areas are assumed to be “tables”) so that the data will be managed in the corresponding tables, respectively.
For example, assume that an event history table manages which events (a sale, a clearance sale or a campaign) have been offered to each customer and when. This table records an ID (customer ID), an event date, the event contents, the event medium (direct mail, e-mail or the like) by which the customer was informed of the event, and so on. Assume further that a purchase history table manages each customer's purchase record and holds a customer ID, a purchased date, a product name and the amount. In other words, the customer's information is divided into and managed in two tables: the event history table and the purchase history table.
The two tables are associated with each other by basic data such as the customer ID, which makes it possible to retrieve data across the two tables. For example, to find the products purchased by customers for whom a campaign C was conducted, the event history table is first searched for the customers for whom the “campaign C” was conducted to obtain their customer IDs. Then, the purchase history table is searched by each customer ID to identify the purchase records of the customer concerned. After that, all the products purchased are output to a display or the like.
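The two-step retrieval described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure of any particular database system: the rows, the field names (`customer_id`, `event`, `product` and so on) and the function name `products_for_campaign` are all hypothetical.

```python
# Hypothetical in-memory stand-ins for the two tables; every row is made up.
event_history = [
    {"customer_id": 1, "event_date": "2001-04-01", "event": "campaign C", "medium": "e-mail"},
    {"customer_id": 2, "event_date": "2001-04-03", "event": "clearance sale", "medium": "direct mail"},
    {"customer_id": 3, "event_date": "2001-04-05", "event": "campaign C", "medium": "direct mail"},
]
purchase_history = [
    {"customer_id": 1, "purchased_date": "2001-04-10", "product": "camera", "amount": 300},
    {"customer_id": 2, "purchased_date": "2001-04-11", "product": "radio", "amount": 50},
    {"customer_id": 3, "purchased_date": "2001-04-12", "product": "tripod", "amount": 40},
    {"customer_id": 1, "purchased_date": "2001-04-15", "product": "lens", "amount": 120},
]


def products_for_campaign(event_name, events, purchases):
    """Return the products bought by customers who were offered `event_name`."""
    # Step 1: search the event history table to obtain the customer IDs.
    customer_ids = {row["customer_id"] for row in events if row["event"] == event_name}
    # Step 2: search the purchase history table by those customer IDs.
    return [row["product"] for row in purchases if row["customer_id"] in customer_ids]


print(products_for_campaign("campaign C", event_history, purchase_history))
# → ['camera', 'tripod', 'lens']
```

The customer ID is the only value shared by the two tables, so it is the only link the second step can use.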
The above-mentioned sequence of processing can be carried out in two ways: by executing an explicitly described procedure, or by having a data processing system (a database system or the like) interpret the data automatically, that is, implicitly. In either case, explicitly or implicitly, the two tables are associated with each other by performing retrieval processing by the customer ID.
Detailed information about the customer identified from one table is obtained by searching the other table by the customer ID. This processing is called a table join. The table join, however, involves plural processing steps, so that as the amount of data to be processed increases, more processing time is required, which may make the join impractical.
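How the cost of a join can grow with the amount of data may be illustrated with a small sketch. It assumes the simplest possible strategy, a nested-loop join that compares every row of one table with every row of the other; the text above does not name a specific join algorithm, so this choice, the helper names `nested_loop_join` and `make_tables`, and the generated rows are all hypothetical.

```python
def nested_loop_join(left, right, key):
    """Join two lists of dict rows on `key` and count the key comparisons made."""
    joined = []
    comparisons = 0
    for left_row in left:
        for right_row in right:
            comparisons += 1
            if left_row[key] == right_row[key]:
                # Merge the matching rows into one combined row.
                joined.append({**left_row, **right_row})
    return joined, comparisons


def make_tables(n):
    """Build two made-up tables with one row per customer ID 0..n-1."""
    events = [{"customer_id": i, "event": "campaign C"} for i in range(n)]
    purchases = [{"customer_id": i, "product": f"product-{i}"} for i in range(n)]
    return events, purchases


for n in (10, 100, 1000):
    events, purchases = make_tables(n)
    rows, comparisons = nested_loop_join(events, purchases, "customer_id")
    print(n, len(rows), comparisons)
# → 10 10 100
# → 100 100 10000
# → 1000 1000 1000000
```

Under this assumption, a tenfold increase in the number of customers yields a hundredfold increase in comparisons, even though the number of joined rows grows only tenfold.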
Various techniques for speeding up table joins have been developed. These conventional techniques all record the relationships between tables in a storage device in order to speed up the processing. Because data must then be input from and output to the storage device, the achievable speed-up is limited. Further, because all the relationships between tables must be recorded in the storage device, the amount of relationship information grows as the number of tables increases, which may result in overhead for detecting and recording the relationships and in increased use of storage space.
When customer information that is divided among and managed in plural tables is retrieved by a condition spanning the tables, the processing load of the table join increases as the number of customers to be managed increases. As a result, there is a high possibility that the processing load will be too heavy for practical operation. Further, as the number of tables to be processed increases, the number of table joins and hence the processing load also increase, so that the load may again become too heavy for practical operation.