1. Technical Field
The present invention relates in general to the field of computers and similar technology systems, and in particular to software utilized by such systems to implement methods and processes. Still more particularly, the present invention relates to a computer-implementable method and system for classifying small collections of high-value entities with missing data.
2. Description of the Related Art
Classification is a well-studied problem with many solutions in statistics and neural networks. However, both families of solutions are sensitive to missing data. When data is missing, the common remedies are to discard entire cases, to discard pairs of values, or to replace missing values with a surrogate such as the mean or median. For a small collection, however, discarding cases or pairs is undesirable because each case accounts for a high proportion of the collection, and missing-value substitution is ineffective because a mean or median may not sufficiently represent an individual, high-value case.
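The three conventional remedies named above can be illustrated with a minimal sketch. The five-entity collection, the attribute values, and the function names below are hypothetical and chosen only for illustration; they do not come from the invention itself.

```python
from statistics import mean

# Hypothetical collection: 5 entities, 3 attributes; None marks a missing value.
cases = [
    [1.0, 2.0, 3.0],
    [2.0, None, 1.0],
    [None, 4.0, 5.0],
    [4.0, 3.0, None],
    [5.0, 1.0, 2.0],
]

def listwise_delete(rows):
    """Discard every case that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r)]

def pairwise_complete(rows, i, j):
    """Keep only the (i, j) value pairs in which both attributes are present."""
    return [(r[i], r[j]) for r in rows if r[i] is not None and r[j] is not None]

def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = [
        mean(r[c] for r in rows if r[c] is not None) for c in range(n_cols)
    ]
    return [[col_means[c] if v is None else v for c, v in enumerate(r)] for r in rows]

kept = listwise_delete(cases)            # only fully observed cases survive
pairs = pairwise_complete(cases, 0, 1)   # usable pairs for attributes 0 and 1
imputed = mean_impute(cases)             # every gap filled with a column mean
fraction_lost = 1 - len(kept) / len(cases)

print(f"cases kept: {len(kept)} of {len(cases)} (lost {fraction_lost:.0%})")
print(f"usable pairs for attributes 0 and 1: {len(pairs)}")
print(f"imputed collection: {imputed}")
```

In this toy collection, only three values are missing, yet discarding incomplete cases removes three of the five entities, and mean substitution assigns each incomplete entity a value that describes the collection rather than that entity.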
Nevertheless, as the value of individual entities increases, the benefits of classification may be sufficient to compel a solution, even for small collections with missing data. This occurs in fields where decisions must be made even if the available information is meager, and where “no action” is itself a decision whose ramifications can be as severe as those of the wrong action. For example, if each of a couple of dozen entities could result in hundreds of millions of dollars in profit or loss, or in the defection of an entire customer segment, even approximate classification can be a powerful aid to decision making, because the decisions might otherwise be based entirely on intuition.