The use of data analysis tools has increased dramatically as society has become more dependent on digital information storage. In e-commerce and other Internet and non-Internet applications, databases are generated and maintained that have astronomically large amounts of information. Such information is typically analyzed, or “mined,” to learn additional information regarding customers, users, products, etc. This information allows businesses and other users to better implement their products and/or ideas.
Electronic commerce has pervaded almost every conceivable type of business. People have come to expect that their favorite stores not only have brick-and-mortar locations but can also be accessed “online,” typically via the Internet's World Wide Web. The Web allows customers to view graphical representations of a business's store and products. Ease of use from the home and convenient purchasing methods typically lead to increased sales. Buyers enjoy the freedom of being able to comparison shop without spending time and money to drive from store to store.
Online commerce has continuously developed to bring a more enjoyable buying experience to online buyers. Often, websites require a “log in” and/or utilize a “cookie” to track which buyer is looking at their website. With this information, a business can track purchase parameters such as type, size, quantity, and purchasing frequency. This is valuable information because it allows a company to forecast future sales and to determine which goods are of the most interest to online buyers. Typically, however, people are individual in nature, and each person tends to have slightly different likes and dislikes. For example, a company that sells a lot of cellophane tape online might assume that its buyers are using the tape for craft projects. Since the company also sells colored glitter, it may include an advertisement for glitter next to the tape advertisement on its website. In actuality, however, most of the customers are purchasing the tape for business office use, and the glitter advertisement may even turn some customers away because the company does not seem to understand its customers' needs. The glitter advertisement could then even lead to decreased tape sales. Had the company instead offered staples and/or paper clips along with the tape, it might have seen increased sales for all of its products, as buyers might now perceive its store as a “one-stop shop” for all of their business office supply needs.
Pairing up items for selling is often known as “associative selling.” An effort is made to correlate various items/products based upon a particular buyer's past buying habits and/or the past buying habits of other buyers who purchased similar items. This associative process can also be expanded beyond direct product sales. It can be utilized indirectly to enhance sales, for example by analyzing television viewing habits. A television company might predict that most viewers of show X are men who prefer rugged sports such as football, extreme mountaineering, and rugby. This would suggest to the television company that programming an opera or ballet in this time slot would probably reduce its viewer ratings. Even the existing show could be “enhanced” with more rugged content to increase the size of show X's audience. A successful show with a large audience naturally draws advertisers who want to reach more of their market. Thus, the viewing habits can even be used to provide appropriate commercials that have a high audience acceptance rate for a particular genre of viewers.
A salesperson typically approaches a customer and asks a series of questions to better understand the customer's likes and dislikes along with prior purchasing habits. Through this interaction, the salesperson is able to suggest products this particular customer might like. This same type of “associative selling” is just as important to online merchants. Online, however, there is no salesperson to “size up” a customer and determine their needs and wants. Instead, programs are utilized to determine suggestions for online buyers when they visit a business's website. For example, consider an online buyer who previously bought a dog bowl and a dog bone. Probabilities can be determined that show that it is likely that this person owns a dog. The person might, therefore, be interested in dog-related items such as dog collars, leashes, and brushes. When these items are brought to the attention of the buyer and match the buyer's needs, the buyer is more likely to purchase them than to purchase, for instance, advertised catnip or a bird feeder.
Although knowing associations is extremely advantageous, it is also generally very difficult to actually determine those associations. This is generally due to complex computing requirements, difficulty in accessing and retrieving the necessary information, and/or long computation times. Typically, a process reviews the data and looks for patterns in the data along with the frequency with which the patterns appear. These patterns help determine “association rules” that can be analyzed to estimate how likely a particular outcome is, given particular data. Generally speaking, only the stronger association rules or those above a certain level of frequency are utilized. Thus, an association that occurs fewer than, for example, five times might be discarded. This frequency threshold limit is also known as the “minimum support.”
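The frequency counting and minimum-support filtering described above can be illustrated with a minimal sketch. The purchase data, the function names `frequent_itemsets` and `pair_rules`, and the limit to itemsets of at most two items are assumptions made for illustration only. The "confidence" figure (how often the second item appears among baskets containing the first) is a commonly used measure of rule strength that the text above does not name.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories; each set is one buyer's basket.
transactions = [
    {"dog bowl", "dog bone", "leash"},
    {"dog bowl", "dog bone"},
    {"dog bowl", "leash"},
    {"tape", "staples"},
    {"tape", "staples", "paper clips"},
    {"tape", "glitter"},
]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to max_size items; keep those meeting min_support."""
    counts = Counter()
    for basket in transactions:
        for size in range(1, max_size + 1):
            # Sorting gives each itemset one canonical tuple key.
            for itemset in combinations(sorted(basket), size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def pair_rules(frequent):
    """Derive rules lhs -> rhs from frequent pairs.

    Confidence is count(lhs and rhs together) / count(lhs).
    """
    rules = {}
    for itemset, n in frequent.items():
        if len(itemset) != 2:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            # Any item in a frequent pair is itself frequent, so the lookup succeeds.
            rules[(lhs, rhs)] = n / frequent[(lhs,)]
    return rules


# Discard anything seen fewer than two times (the minimum support).
frequent = frequent_itemsets(transactions, min_support=2)
rules = pair_rules(frequent)

for (lhs, rhs), confidence in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: confidence {confidence:.2f}")
```

In this toy data, glitter appears in only one basket, so no rule pairing tape with glitter survives the threshold, while tape and staples do form a rule. Note that the sketch keeps a count for every candidate itemset in memory before filtering, so the number of counters grows quickly as the number of distinct items grows.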
For large amounts of data, the review process to determine association rules often requires large amounts of memory. It is common for all available memory to be utilized before all of the data has been reviewed. This causes decreased performance in systems where alternative memory is available, such as those that page memory onto a hard drive and the like. Systems without alternative memory might not be able to process the data at all. Thus, memory size has a substantial impact on how well, and whether, a system can fully process large databases. The result is either an ever-increasing demand for more memory to compensate or an ever-increasing processing time while the system accesses alternate memory storage.