As shown in FIG. 1, a histogram 100 is an array of ‘bins’ 101. Each bin corresponds to a range 102 of values of a sampled data set. The bin ‘counts’ the frequency 103 of occurrences of sample values in a particular range. In other words, the histogram represents a frequency distribution of the samples in the data set.
For example, a histogram of a sampled color image ‘counts’, in each bin, the number of pixels whose color values fall within the range of that bin. Thus, the histogram is a mapping from the sampled data set to a set of non-negative real numbers R+.
From a probabilistic point of view, a normalization of the histogram results in a discrete function that resembles a probability density function of the data set. Histograms can be used to determine statistical properties of the data set, such as distribution, spread, and outliers.
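The binning and normalization described above can be illustrated with a minimal sketch. It assumes equal-width bins over a half-open value range; the function names `build_histogram` and `normalize`, and the sample values, are hypothetical and chosen only for illustration.

```python
def build_histogram(samples, num_bins, lo, hi):
    """Count the samples falling into each of num_bins equal-width bins over [lo, hi)."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for s in samples:
        if lo <= s < hi:
            # Clamp guards against floating-point rounding at the upper edge.
            index = min(int((s - lo) / width), num_bins - 1)
            counts[index] += 1
    return counts


def normalize(counts):
    """Scale bin counts so they sum to 1, giving a discrete distribution."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Eight illustrative 8-bit intensity samples, four bins of width 64.
samples = [0, 10, 20, 64, 70, 128, 200, 255]
counts = build_histogram(samples, num_bins=4, lo=0, hi=256)
probs = normalize(counts)
print(counts)  # [3, 2, 1, 2]
print(probs)   # [0.375, 0.25, 0.125, 0.25]
```

The unnormalized counts are non-negative, and the normalized values sum to one, which is what allows the normalized histogram to be read as a discrete approximation of a probability distribution.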
Histograms are used in many computer vision applications, such as object based indexing and retrieval, C. Carson, M. Thomas, S. Belongie, J. M. Hellerstein, and J. Malik, “Blobworld: A system for region-based image indexing and retrieval”, Proceedings of ICVS, 1999 and J. Huang, S. Kumar, M. Mitra, W. J. Zhu, and R. Zabih, “Image indexing using color correlograms”, Proceedings of CVPR, 1997; image segmentation, D. A. Forsyth and J. Ponce. “Computer Vision: A Modern Approach”, Prentice Hall, 2002 and S. Ruiz-Correa, L. G. Shapiro, and M. Meila, “A new paradigm for recognizing 3-D object shapes from range data”, Proceedings of CVPR, 2003; object detection, C. Papageorgiou, M. Oren, and T. Poggio, “A general framework for object detection,” Proceedings of ICCV, 1998; and object tracking, D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of nonrigid objects using mean shift,” Proceedings of CVPR, 2000.
A face detector is described by P. Viola and M. Jones, “Robust real-time face detection”, Proceedings of ICCV, page II: 747, 2001. As described by Viola et al., it is possible to determine the sums of the intensity values within rectangular windows scanned over an image without repeating the summation for each possible window. This is done by defining a cumulative or integral intensity image, where each pixel holds the sum of all values to the left of and above the pixel, including the value of the pixel itself. The integral intensity image can be determined for the entire image with only four arithmetic operations per pixel. One starts the scan at the top left corner pixel of the image, going first to the right and then down. A function determines the value of the current pixel in the integral image to be the intensity of the current pixel, plus the integral image values of the pixels directly above and directly to the left, minus the integral image value of the pixel to the upper left. Once the integral image is available, the sum of an image function in any rectangle can be determined with another four arithmetic operations, with appropriate modifications at the border. Thus, the integral image is constructed in linear time, and thereafter the sum over any rectangle is determined with a constant number of operations, no matter how many distinct rectangles are evaluated.
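The integral image construction and the rectangle sum can be sketched as follows. This is an illustrative sketch, not the code of Viola et al.; the names `integral_image` and `rect_sum` are hypothetical, and padding the table with a zero row and column is one assumed way of handling the border.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    # Extra zero row and column so border pixels need no special case.
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):          # scan top to bottom
        for x in range(w):      # and left to right within each row
            ii[y + 1][x + 1] = (img[y][x]       # current pixel
                                + ii[y][x + 1]  # table entry directly above
                                + ii[y + 1][x]  # table entry directly to the left
                                - ii[y][x])     # upper-left entry, counted twice
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 3, 3))  # 45, the whole image
print(rect_sum(ii, 1, 1, 3, 3))  # 28, the lower-right 2x2 block: 5 + 6 + 8 + 9
```

Building the table visits each pixel once, while each call to `rect_sum` reads four table entries regardless of the size of the rectangle.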
Unfortunately, it is time consuming to extract and search conventional histograms. Only an exhaustive search can provide a global optimum. Sub-optimal searches, such as gradient descent, and application-specific constraints can accelerate the search, but they do not guarantee the global optimum. However, computer vision applications that rely on the optimal solutions, such as object detection and tracking, demand a theoretical breakthrough in histogram extraction.
Conventionally, an exhaustive search is required to measure all distances between a particular histogram and the histograms of all possible target regions. This process requires generation of histograms for the regions centered at every possible point, e.g., at every pixel. In cases where the search is performed at different scales, i.e., for different target region sizes, the process is repeated as many times as the number of scales.
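The cost of the exhaustive search can be seen in a minimal sketch. It is not the pseudocode of FIG. 2; it assumes a grayscale image stored as a list of rows, square regions indexed by their top left corner rather than their center, and an L1 distance between normalized histograms. All names are hypothetical.

```python
def window_histogram(img, top, left, size, num_bins, max_value=256):
    """Normalized intensity histogram of the size x size region at (top, left)."""
    counts = [0] * num_bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            counts[img[y][x] * num_bins // max_value] += 1
    return [c / (size * size) for c in counts]


def l1_distance(a, b):
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(p - q) for p, q in zip(a, b))


def exhaustive_search(img, target, sizes, num_bins):
    """Return (distance, top, left, size) of the region closest to target."""
    h, w = len(img), len(img[0])
    best = None
    for size in sizes:                        # repeated once per scale
        for top in range(h - size + 1):       # every candidate row
            for left in range(w - size + 1):  # every candidate column
                # The histogram is rebuilt from scratch for each region.
                hist = window_histogram(img, top, left, size, num_bins)
                d = l1_distance(target, hist)
                if best is None or d < best[0]:
                    best = (d, top, left, size)
    return best


img = [[0, 0,   0,   0],
       [0, 0, 200, 200],
       [0, 0, 200, 200],
       [0, 0,   0,   0]]
target = window_histogram(img, 1, 2, 2, num_bins=4)
best = exhaustive_search(img, target, sizes=[2, 3], num_bins=4)
print(best)  # (0.0, 1, 2, 2)
```

Every candidate region revisits all of its pixels, so the work grows with the number of positions, multiplied by the region area, multiplied by the number of scales.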
FIG. 2 shows the pseudocode 200 of a conventional histogram search.
Up to now, this conventional approach has been the only known solution that guarantees finding a global optimum for a histogram-based search.
It is desired to improve the speed of histogram extraction and histogram searching by several orders of magnitude.