Computational resources have become less expensive, and processing capabilities has become more sophisticated. Accordingly, the number and size of data sets has been exponentially increasing. It is therefore important to decide how to effectively and efficiently process these data sets. One approach is to use reference data sets and to selectively concentrate on how individual data sets differ from one or more reference data sets. By detecting these differences, a more compact representation of a data set is reached.