1. Field of the Invention
The present invention generally relates to systems and methods for identifying sets of similar products, and more specifically relates to systems and methods for automatically determining which sets of products are likely to be substitutes.
2. Description of Related Art
A number of retail processes and decisions depend on knowing which products are likely substitutes for other products. These decisions include various forecasting processes that estimate how sales will evolve in response to pricing and promotional changes or in the presence of stockouts. They also include decisions about which products to offer at any point in time, which products to replenish, or which products to offer in the future for planning purposes. Substitutes are not always easy to recognize, however. By way of example, it can be difficult to identify that Advil and Motrin are similar products by simply relying on the descriptive information in a retailer's internal systems for printing shelf tags. Brand names do not always provide information about the content and functionality of a product.
Several approaches can be used to determine possible substitutes. For example, a judgmental approach can be used. In the judgmental approach, business experts associate substitute items with each other. With or without some system support for the process, the experts are asked to assign items to groups of substitutes based on their knowledge of the items and customer behavior. This approach suffers from inconsistencies in knowledge and skill across the company and from practical difficulties of maintaining the information over time.
Another approach is a statistical approach. In the statistical approach, a model is proposed that relates the quantity sold of one item (the “target” item) to the pricing or promotional variations for another item (the “potential substitute”). If the former can be shown to be significantly correlated with the latter, then the items are presumed to be substitutes. This approach suffers from its dependence on historical data that meets the conditions needed to test such a model. These conditions include 1) the target item has sufficient history in which the potential substitute does not exhibit price and promotional variations, 2) the price and promotional variations of the potential substitute occur when the target item is available for sale and in stock, and 3) the price and promotional variations of the potential substitute do not correspond closely to price and promotional variations in other potential substitutes (i.e. ‘co-promotions’).
Even when such conditions are reflected in the data, the nature of retail sales is such that many non-modeled phenomena will affect the sales of the target item, including the fundamental randomness of consumer behavior. The consequence is that, even if the pricing and promotional variations in the potential substitute item do, in fact, influence the sale of the target item, this may not be reflected in the correlation measured by the model, or may not be measured with sufficient confidence to declare that the items are substitutes.
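By way of illustration only, the statistical approach described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a method taken from any particular system: the function names, the weekly price and sales figures, the noise values, and the threshold t_critical=2.0 are all hypothetical. The sketch computes a Pearson correlation between the potential substitute's price and the target item's unit sales, converts it to a t-statistic, and presumes substitution only when the correlation is positive and the statistic clears the threshold.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # No variation in one series: the model cannot be tested.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t-statistic for a correlation r measured over n observations."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def looks_like_substitute(substitute_prices, target_units, t_critical=2.0):
    """Presume substitution when target sales rise significantly as the
    potential substitute's price rises. t_critical is an assumed rough
    threshold, not a value from the source text."""
    r = pearson_r(substitute_prices, target_units)
    t = t_statistic(r, len(substitute_prices))
    return r > 0 and t > t_critical


# Hypothetical weekly data: the potential substitute is discounted in
# some weeks, and the target item's sales dip in those same weeks.
substitute_prices = [3.0, 3.0, 2.5, 3.0, 2.0, 3.0, 3.0, 2.5, 3.0, 2.0]
clean_units = [100, 98, 90, 101, 80, 99, 102, 91, 100, 79]

# The same underlying effect with hypothetical non-modeled variation added.
noise = [0, -38, 20, 39, 15, -29, 28, 29, -15, 26]
noisy_units = [u + e for u, e in zip(clean_units, noise)]

clean_result = looks_like_substitute(substitute_prices, clean_units)
noisy_result = looks_like_substitute(substitute_prices, noisy_units)
```

With these illustrative figures, the clean series passes the test while the noisy series does not, even though the same substitution effect underlies both. This mirrors the limitation noted above: non-modeled variation can prevent a real relationship from being measured with sufficient confidence.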