A histogram is a graphical chart, such as a bar chart, representing a frequency distribution of data elements, in which the heights of objects in the chart represent the observed frequencies of the data elements. The many possible histograms of a single data sample often vary greatly.
Histograms have been in use for approximately 300 years; they were perhaps the first graphic for quantitative data and are now the most widely used. The histogram is the most common graph of the distribution of one quantitative variable. Every year millions of individuals look at, and may be influenced by, histograms.
However, just as a data sample does not necessarily represent a population, a histogram does not necessarily represent a data sample. The appearance of a histogram of a data sample can be misleading. To make informed use of histograms for a presentation, an analysis or a decision, a choice among many possible histograms is required.
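The point that one sample can yield different-looking histograms can be illustrated with a minimal sketch. The function name `uniform_histogram`, its parameters `origin` and `width`, and the six-element sample are all hypothetical and chosen only for illustration; bins are assumed to be half-open intervals of equal width.

```python
import math

def uniform_histogram(data, origin, width):
    """Count data in bins [origin + k*width, origin + (k+1)*width).

    Returns the counts from the first occupied bin to the last one.
    Hypothetical helper, not taken from the source text.
    """
    counts = {}
    for x in data:
        k = math.floor((x - origin) / width)  # index of the bin holding x
        counts[k] = counts.get(k, 0) + 1
    lo, hi = min(counts), max(counts)
    return [counts.get(k, 0) for k in range(lo, hi + 1)]

# Made-up sample, used only to show the effect.
sample = [1.0, 1.9, 2.1, 2.9, 3.1, 4.0]

# Same data, same bin width, two different bin locations.
print(uniform_histogram(sample, origin=0.0, width=1.0))  # [2, 2, 1, 1]
print(uniform_histogram(sample, origin=0.5, width=1.0))  # [1, 2, 2, 1]
```

With the bins anchored at 0.0 the sample looks skewed; shifting the anchor by half a bin width makes the same sample look symmetric.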
When a histogram appearance is used and the choice matters, experts may wish to consider all of the other possible appearances, knowing with certainty that such a method and system provides a palette of all the possibilities. Selection and optimality criteria may then be applied to the finite set of possible appearances. This gives a clearer understanding than simply allowing location and width to vary continuously, haphazardly, or according to a procedure unrelated to the location and width level sets for the different appearances. It may also be of interest to consider issues of human cognition in the context of data grouped into uniformly wide intervals. In practice, of course, it is impossible to vary any parameter continuously.
For most samples of n data elements, many histogram appearances are possible and many are not. One problem is to determine well-defined subsets of all the histogram appearances that are possible for a given data sample, and to display those appearances together with a typical or preferred histogram having each appearance.
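As a rough illustration of what a set of possible appearances looks like, the sketch below collects the distinct count patterns found by stepping bin location and width over a coarse grid. This brute-force search is an assumption made for illustration only; it is not the method the text goes on to describe, and a grid can miss appearances that occur only between grid points. The names `appearance` and `grid_appearances`, the `steps` parameter, the candidate widths, and the sample are all hypothetical.

```python
import math

def appearance(data, origin, width):
    """Tuple of bin counts from the first to the last occupied bin."""
    counts = {}
    for x in data:
        k = math.floor((x - origin) / width)  # half-open bins of equal width
        counts[k] = counts.get(k, 0) + 1
    lo, hi = min(counts), max(counts)
    return tuple(counts.get(k, 0) for k in range(lo, hi + 1))

def grid_appearances(data, widths, steps=50):
    """Collect the distinct appearances seen on a grid of widths and origins.

    For each width, the origin is shifted in `steps` equal increments
    across one full bin width below the smallest data value.
    """
    found = set()
    for w in widths:
        for i in range(steps):
            origin = min(data) - w * i / steps
            found.add(appearance(data, origin, w))
    return found

# Made-up sample and candidate widths, used only to show the idea.
sample = [1.0, 1.9, 2.1, 2.9, 3.1, 4.0]
shapes = grid_appearances(sample, widths=[1.0, 1.5, 2.0])
for s in sorted(shapes):
    print(s)
```

Every tuple printed is a count pattern that some choice of location and width actually produces for this sample; patterns that never appear are candidates for the "not possible" set, subject to the limits of the grid.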
Another problem is that, for small data samples, error in uniform-bin-width histograms arises both from sampling error and from histogram appearance variability. Thus, it is desirable to provide a method and system for determining histogram appearances from small data samples.