When comparing video data and other multimedia data, one cannot expect to find exactly identical data. Comparison of multimedia data therefore typically relies on similarity-based techniques, which often measure the similarity of the contents numerically. One such measure is known as the earth mover's distance (EMD). The EMD may be used to match images in image retrieval applications. Such images may differ considerably even when they are views of the same scene, because of illumination changes, viewpoint motion, occlusions, etc.
Different features of an image are typically described using various distributions. For example, the texture content of an image can be described by a distribution of local energy over frequency. The overall brightness content of a gray-scale image may be described by a one-dimensional distribution of image intensities, and a three-dimensional distribution can play a similar role for color images. The EMD is based on the minimal cost that must be paid to transform one distribution into the other. Given two distributions, one can be seen as a mass of earth properly spread in space, and the other as a collection of holes in the same space. The EMD measures the least amount of work needed to fill the holes with earth, where a unit of work corresponds to transporting a unit of earth by a unit of ground distance. The EMD is described in more detail in a publication by Y. Rubner, C. Tomasi, and L. Guibas, “The Earth Mover's Distance as a Metric for Image Retrieval,” Technical Report STAN-CS-TN-98-86, Computer Science Department, Stanford University, September 1998.
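The earth-and-holes picture can be illustrated with a minimal sketch for the simplest case mentioned above, a one-dimensional intensity distribution. The sketch rests on several assumptions that are not part of the cited report: both histograms have the same number of bins, each is normalized to unit total mass, and the ground distance between bins i and j is |i - j| bin widths. Under those assumptions the EMD reduces to the sum of absolute differences between the two cumulative distributions; the general case instead requires solving a transportation problem. The function names `normalize` and `emd_1d` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (unit mass of earth)."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def emd_1d(hist_a, hist_b):
    """EMD between two 1-D histograms with equal bin counts.

    Assumes unit total mass after normalization and a ground distance
    of |i - j| between bins i and j. In this special case the minimal
    work equals the summed absolute difference of the two CDFs.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    cdf_a = accumulate(normalize(hist_a))
    cdf_b = accumulate(normalize(hist_b))
    # At each bin boundary, |cdf_a - cdf_b| is the mass that must cross it.
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


if __name__ == "__main__":
    dark = [4, 0, 0]    # all mass in the darkest bin
    bright = [0, 0, 4]  # all mass in the brightest bin
    print(emd_1d(dark, bright))  # all mass moves two bins: 2.0
```

In this example all of the earth sits in the first bin and all of the holes in the third, so the whole unit of mass must travel two bins and the distance is 2.0. A bin-by-bin comparison would report the same difference for any pair of non-overlapping histograms, whereas the EMD grows with how far the mass has to move.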