Modern real-world applications generate large amounts of data that is often uncertain and imprecise. For instance, data integration and record linkage tools can produce distinct degrees of confidence for output data tuples (based on the quality of the match for the underlying entities). Similarly, pervasive multi-sensor computing applications need to routinely handle noisy sensor readings. Some research efforts on probabilistic data management aim to incorporate uncertainty and probabilistic information as “first-class citizens” of a database system. In conventional database systems, query processing over deterministic data relies on data reduction methods that compress large amounts of data down to concise data synopses while retaining key statistical traits of the original data collection.