Data store systems, such as databases and file systems, store vast amounts of logically related data in the form of data tables. Queries may be used to perform various operations and analyses on the data tables to generate desired results. Query planning typically relies on statistics regarding the contents of the data tables. Statistics determined on the columns of data tables, and in particular on the column data values stored in those columns, play a crucial role in query planning, as a data store system may rely on such statistics to determine an optimal query execution plan. Valuable statistics include the number of unique values (NUV) in a column and the high mode frequency (HMF), which is the highest frequency of any single value in a column.

Collecting NUV and HMF values may require the underlying data to be sorted on disk, because all unique values and their frequencies cannot be held in a buffer while the data is scanned, due to their overwhelming size. Sorting on disk makes the entire statistics collection process expensive from a system resource perspective. Some implementations provide users with an option to collect statistics from randomly sampled rows or data blocks. This approach can reduce the cost, but the accuracy of the estimates is often poor for a "skewed dataset," which may cause overestimation or underestimation. Such poor estimates cause the optimization process to produce non-optimal plans, and workload execution consequently suffers performance degradation.

A "single-pass" approach may allow each column value of a column under analysis to be analyzed without the aforementioned disk sorting. However, due to the unpredictability of the column values within a column under analysis, determining statistics in this manner may prove overly burdensome. Thus, it would be desirable to implement the single-pass approach with a finite range of values representative of the column values of a column under analysis.
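For concreteness, the two statistics named above can be sketched as follows. This is a minimal illustration, assuming the column's values fit in memory; it reflects the exact-count case that, as the passage explains, becomes infeasible at scale without disk sorting. The function name `column_stats` is hypothetical and not drawn from any particular data store system.

```python
from collections import Counter

def column_stats(values):
    """Compute the number of unique values (NUV) and the high mode
    frequency (HMF, the highest frequency of any single value) for a
    column, assuming all value counts fit in memory."""
    counts = Counter(values)
    nuv = len(counts)                            # distinct values seen
    hmf = max(counts.values()) if counts else 0  # frequency of the mode
    return nuv, hmf

# Example: a small, skewed column where "a" dominates.
col = ["a", "b", "a", "c", "a", "b"]
print(column_stats(col))  # (3, 3): three unique values; "a" occurs 3 times
```

At production scale, the `Counter` above would exceed available buffer memory, which is why exact collection resorts to disk sorting and why sampling or single-pass alternatives are attractive.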