Conventionally, there is a pattern extracting device that extracts a combination of explaining variable values from an event set. The event set contains plural events, each constructed by associating a combination of explaining variable values, which represent the condition of each explaining variable, with a target variable value, which represents the condition of a target variable. The device extracts a combination of explaining variable values for which the rate at which events having a specific target variable value as a constituent element are contained in all events having that combination as a constituent element satisfies a predetermined satisfaction level (e.g., refer to Japanese Laid-open Patent Publication No. 2007-109012).
The event is constructed by associating the combination of explaining variable values representing the condition of each explaining variable with a target variable value representing the condition of a target variable. For example, the event corresponds to sales data collected from each outlet store by a POS (Point Of Sale) system every time a credit card is used. Furthermore, the event set corresponds to information containing plural events.
The sales data are constructed by associating an item of goods sold, the total amount of money, and a monthly use frequency of a credit card with a data ID.
Furthermore, a label representing whether use of the credit card is normal or unjust is allocated in association with the data ID.
The variable is information representing a category such as an item of goods, the amount of money, frequency of use, label or the like.
The variable value is a value representing the condition of a variable and it contains a value represented by a character array (for example, “noble metal”), a numerical value (for example, “50,000 yen”), a value representing a range of the numerical values (for example, “100,000 yen to 150,000 yen”), etc.
The variable value representing the condition of the item of goods is a character array representing the name of goods (for example, “noble metal”, “electrical appliance”, etc.), the variable value representing the condition of the amount of money is a numerical value representing the amount of money (for example, “50,000 yen”, “100,000 yen”), the variable value representing the condition of the monthly use frequency of the credit card is a numerical value representing use frequency (for example, “once”, “twice”), and the variable value of the label is “normal use” or “unjust use”.
Here, when a pattern of goods purchased by a user who uses a credit card once a month is extracted, the following applies. That is, the monthly use frequency of the credit card corresponds to the target variable and “once” corresponds to the target variable value; the item of goods, the amount of money, and the label correspond to explaining variables; “noble metal” and “electrical appliance” correspond to explaining variable values of the item of goods; and “50,000 yen”, “200,000 yen”, etc. correspond to explaining variable values of the amount of money.
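As a non-limiting illustration, one sales data record and its division into explaining variable values and a target variable value might be sketched as follows. The class name SalesEvent, the field names, and the helper split_event are hypothetical and are introduced only for this sketch; they do not appear in the cited publication.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SalesEvent:
    """One event: a sales data record plus its label (field names are assumptions)."""
    data_id: int
    item: str            # e.g. "noble metal", "electrical appliance"
    amount_yen: int      # e.g. 50_000
    monthly_uses: int    # monthly use frequency of the credit card, e.g. 1 for "once"
    label: str           # "normal use" or "unjust use"


def split_event(event, target):
    """Return (explaining variable values, target variable value) for one event.

    The data ID only identifies the record, so it is left out of both parts.
    """
    fields = asdict(event)
    fields.pop("data_id")
    target_value = fields.pop(target)
    return fields, target_value


# Monthly use frequency as the target variable; the rest are explaining variables.
event = SalesEvent(1, "noble metal", 120_000, 1, "normal use")
explaining, target_value = split_event(event, target="monthly_uses")
```

Which variable serves as the target variable is chosen per extraction; passing target="label" instead would make the label the target variable and the monthly use frequency an explaining variable.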
Based on the above assumptions, a case is explained in which a pattern of purchases by a user who uses a credit card once a month is extracted.
Firstly, in a conventional technique, among the sales data having the target variable value “once” as a constituent element, the rate of sales data having the explaining variable value “noble metal” as a constituent element is compared with the rate of sales data having the explaining variable value “electrical appliance” as a constituent element.
If the rate of sales data having the explaining variable value “noble metal” as a constituent element is judged to be higher, the conventional technique then selects the price range that contains the largest number of those sales data (e.g., 100,000 yen to 150,000 yen).
Then, in the conventional technique, the sales data having the selected explaining variable values “noble metal” and “100,000 yen to 150,000 yen” as constituent elements are examined. When the number of such events satisfies a predetermined satisfaction level (e.g., 20 or more) and the rate at which they include the target variable value “once” as a constituent element satisfies a predetermined satisfaction level (e.g., 75%), the following pattern is extracted. The pattern consists of the combination of explaining variable values “100,000 yen to 150,000 yen” and “noble metal” and the target variable value “once” (i.e., a pattern in which a user who uses a credit card once a month tends to purchase noble metal in the price range of 100,000 yen to 150,000 yen).
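The procedure described above might be sketched as follows. This is only an illustrative reading of the conventional technique, not the implementation of the cited publication. The function names, the dictionary keys, and the 50,000-yen price bin width are assumptions, and the sketch further assumes that the event count and the rate are both taken over all sales data having the selected item and price range.

```python
from collections import Counter


def price_range(amount_yen, width=50_000):
    """Half-open price bin [lo, lo + width); the bin width is an assumption."""
    lo = (amount_yen // width) * width
    return (lo, lo + width)


def extract_pattern(events, target_uses=1, min_count=20, min_rate=0.75):
    """Select item, then price range, then check the satisfaction levels."""
    with_target = [e for e in events if e["monthly_uses"] == target_uses]
    if not with_target:
        return None

    # Step 1: among events with the target value, pick the most frequent item.
    item, _ = Counter(e["item"] for e in with_target).most_common(1)[0]

    # Step 2: for that item, pick the price range containing the most events.
    rng, _ = Counter(
        price_range(e["amount_yen"]) for e in with_target if e["item"] == item
    ).most_common(1)[0]

    # Step 3: check the event count and the rate of the target value
    # over all events that have the selected item and price range.
    matching = [
        e for e in events
        if e["item"] == item and price_range(e["amount_yen"]) == rng
    ]
    hits = sum(1 for e in matching if e["monthly_uses"] == target_uses)
    rate = hits / len(matching)
    if len(matching) >= min_count and rate >= min_rate:
        return {"item": item, "price_range": rng,
                "count": len(matching), "rate": rate}
    return None


# Hypothetical event set, made up purely for illustration.
events = (
    [{"item": "noble metal", "amount_yen": 120_000, "monthly_uses": 1}] * 18
    + [{"item": "noble metal", "amount_yen": 130_000, "monthly_uses": 3}] * 4
    + [{"item": "electrical appliance", "amount_yen": 200_000, "monthly_uses": 1}] * 5
    + [{"item": "electrical appliance", "amount_yen": 60_000, "monthly_uses": 2}] * 10
)
pattern = extract_pattern(events)
```

With this made-up event set, 22 sales data have “noble metal” in the 100,000 yen to 150,000 yen range and 18 of them have the target variable value “once”, so both example satisfaction levels (20 or more events, 75%) are met.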
In the above conventional technique, it is difficult to extract a pattern for a target variable value that is contained at a low rate as a constituent element. More specifically, a target variable value contained at a low rate is obscured by the other variable values of the target variable, so that a pattern involving it is hard to extract.
For example, assume that the rate of sales data having the label target variable value “unjust use” as a constituent element is low, but that users who use a credit card unjustly tend to purchase electrical appliances priced at 300,000 yen or more.
Assume further that, among the sales data having the label target variable value “unjust use” as a constituent element, the rate of sales data having the explaining variable value “noble metal” as a constituent element is compared with the rate of sales data having the explaining variable value “electrical appliance” as a constituent element. In this comparison, the sales data corresponding to the above pattern are hardly reflected in the rate of sales data having the explaining variable value “electrical appliance” as a constituent element.
Therefore, some conventional techniques may judge that the rate of sales data having the explaining variable value “noble metal” is higher. As a result, the pattern in which users who use credit cards unjustly tend to purchase electrical appliances priced at 300,000 yen or more may not be extracted.
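One way this difficulty could arise is sketched below with a small, entirely hypothetical event set. The event counts, the amounts, and the tuple layout (item, amount in yen, label) are assumptions chosen only to make the effect visible; they are not data from the cited publication.

```python
from collections import Counter

# Hypothetical events as (item, amount_yen, label); all numbers are made up.
normal = (
    [("noble metal", 120_000, "normal use")] * 60
    + [("electrical appliance", 80_000, "normal use")] * 30
)
unjust = (
    # Unjust purchases of noble metal, scattered over many price ranges.
    [("noble metal", a, "unjust use")
     for a in (50_000, 120_000, 210_000, 260_000, 330_000, 410_000)]
    # Unjust purchases of electrical appliances priced 300,000 yen or more.
    + [("electrical appliance", a, "unjust use")
       for a in (310_000, 320_000, 350_000, 400_000)]
)
events = normal + unjust

# The target variable value "unjust use" appears in only a small share of events.
unjust_share = len(unjust) / len(events)

# First step of the conventional procedure: compare items among "unjust use" events.
item_counts = Counter(item for item, _, label in events if label == "unjust use")
first_choice = item_counts.most_common(1)[0][0]

# The combination that is never examined once "noble metal" has been selected.
candidates = [
    e for e in events
    if e[0] == "electrical appliance" and e[1] >= 300_000
]
rate_unjust = sum(1 for e in candidates if e[2] == "unjust use") / len(candidates)
```

In this sketch the item comparison selects “noble metal” because it has more “unjust use” sales data, even though every sale of an electrical appliance priced at 300,000 yen or more is labelled “unjust use”. The combination of “electrical appliance” with that price range is therefore not examined by the step-by-step selection.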