The transformation between two signals can be estimated by matching samples from one signal to corresponding samples in the other. For example, the motion of an object can be estimated by tracking image features between successive video frames, as described by Horn and Schunck, Determining Optical Flow, Artificial Intelligence, 17, 185-204, 1981. Under the assumption that corresponding points have the same brightness or colour in adjacent frames, the motion field can be estimated by matching points in one image to points in the next. The accuracy of such estimates is limited by the content of the images. For example, regions of uniform brightness or colour do not yield accurate motion information, because each point in such a region has many possible matches in the adjacent video frames.
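The ambiguity in uniform regions can be illustrated with a minimal sketch. It assumes a simple sum-of-squared-differences patch search, which is an illustrative stand-in and not the method of Horn and Schunck; the function name `ssd_match`, the patch size and the search radius are all hypothetical choices.

```python
import numpy as np

def ssd_match(frame_a, frame_b, top, left, size=5, radius=3):
    """Match one patch of frame_a against shifted patches of frame_b.

    Returns the best displacement (dy, dx) and the number of candidate
    displacements that tie for the lowest sum of squared differences.
    """
    patch = frame_a[top:top + size, left:left + size]
    scores = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = top + dy, left + dx
            # Skip candidates that fall outside frame_b.
            if r < 0 or c < 0 or r + size > frame_b.shape[0] or c + size > frame_b.shape[1]:
                continue
            candidate = frame_b[r:r + size, c:c + size]
            scores[(dy, dx)] = float(np.sum((candidate - patch) ** 2))
    best = min(scores.values())
    ties = [d for d, s in scores.items() if np.isclose(s, best)]
    return ties[0], len(ties)

rng = np.random.default_rng(0)
textured = rng.random((32, 32))
moved = np.roll(textured, shift=(1, 2), axis=(0, 1))  # 1 pixel down, 2 pixels right
flat = np.full((32, 32), 0.5)                         # uniform brightness

print(ssd_match(textured, moved, top=12, left=12))  # a single best match at the true shift
print(ssd_match(flat, flat, top=12, left=12))       # every candidate displacement ties
```

In the textured frame the brightness-constancy assumption singles out one displacement, whereas in the uniform frame all candidates score equally and the reported displacement is arbitrary.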
Horn and Schunck show, however, that the motion field can still be estimated in featureless regions. If the scene consists of locally smooth surfaces, then the resulting motion field will also be locally smooth. Adopting this smoothness as a constraint ensures that the motion estimates in featureless regions are consistent with the estimates obtained in nearby but more feature-rich regions of the images. The effect of the smoothness constraint is to require that neighbouring motion estimates are approximately parallel to one another. In essence, this method imposes a smoothness constraint on a geometric mapping between matching features.
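A minimal sketch of this kind of smoothness-constrained estimate follows. It uses the commonly cited iterative update in which each flow vector is pulled towards the mean of its neighbours and then corrected by the brightness-constancy residual. The derivative estimates are simplified (plain finite differences rather than the averaging used in the original paper), and the names `horn_schunck` and `neighbour_mean`, the weight `alpha`, the iteration count and the synthetic test frames are assumptions made for illustration.

```python
import numpy as np

def neighbour_mean(f):
    """Mean of the four nearest neighbours, with edge replication."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(frame0, frame1, alpha=1.0, n_iter=200):
    """Estimate a dense flow field (u, v) between two greyscale frames."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    # Spatial derivatives of the mean frame, temporal derivative between frames.
    iy, ix = np.gradient(0.5 * (f0 + f1))
    it = f1 - f0
    u = np.zeros_like(f0)  # horizontal flow, pixels per frame
    v = np.zeros_like(f0)  # vertical flow, pixels per frame
    for _ in range(n_iter):
        u_bar = neighbour_mean(u)
        v_bar = neighbour_mean(v)
        # Brightness-constancy residual, damped by the smoothness weight alpha.
        resid = (ix * u_bar + iy * v_bar + it) / (alpha ** 2 + ix ** 2 + iy ** 2)
        u = u_bar - ix * resid
        v = v_bar - iy * resid
    return u, v

# Synthetic example: a smooth blob that moves one pixel to the right.
yy, xx = np.mgrid[0:64, 0:64]

def blob(cx):
    return 255.0 * np.exp(-((xx - cx) ** 2 + (yy - 32) ** 2) / (2 * 6.0 ** 2))

u, v = horn_schunck(blob(30), blob(31))
print(u[32, 24], v[32, 24])  # flow on the flank of the blob
```

Where the image gradient is strong the data term dominates the update. Where the gradient vanishes the residual term drops out and the estimate is simply the neighbourhood mean, which is how estimates propagate into featureless regions.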
An alternative approach to optical flow estimation is given in Jähne et al., Study of Dynamical Processes with Tensor-Based Spatiotemporal Image Processing Techniques, Proceedings of European Conference on Computer Vision, pp 323-336, 1998. In order to detect motion between adjacent frames of video data, the video sequence is represented as a volume in (x, y, t), where x and y are the two spatial dimensions of the video data and t represents time. Adopting this approach, any moving feature will appear as an oriented structure in the volume. For example, a moving point traces out a line in the three-dimensional space, whereas a moving edge sweeps out an oriented plane. Such lines and planes can be detected by evaluating the structure tensor at each point in the space. The structure tensor is defined in Knutsson, Representing Local Structure using Tensors, Proceedings of the Sixth Scandinavian Conference on Image Analysis, pp 244-251, 1989, in which the structure tensor is represented by a matrix, the eigenvalues of which are functions of the local dimensionality of the signal.
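The relationship between the eigenvalues and the local structure can be sketched as follows. The sketch assumes a gradient-based structure tensor averaged over a single small volume, which is one common formulation and not necessarily the one used in the cited works; the function names, tolerances and synthetic volumes are hypothetical.

```python
import numpy as np

def structure_tensor(volume):
    """Average outer product of the (x, y, t) gradient over one neighbourhood.

    `volume` is indexed as volume[t, y, x]. Border voxels are dropped so that
    only central differences contribute.
    """
    gt, gy, gx = np.gradient(volume.astype(float))
    inner = (slice(1, -1),) * 3
    g = np.stack([gx[inner].ravel(), gy[inner].ravel(), gt[inner].ravel()])
    return g @ g.T / g.shape[1]

def local_dimensionality(tensor, rel_tol=1e-6, abs_tol=1e-12):
    """Count the significant eigenvalues: 0 uniform, 1 plane, 2 line."""
    eigvals = np.linalg.eigvalsh(tensor)  # ascending order
    if eigvals[-1] < abs_tol:
        return 0
    return int(np.sum(eigvals > rel_tol * eigvals[-1]))

def velocity_from_line(tensor):
    """Velocity (vx, vy) from the eigenvector of the smallest eigenvalue."""
    _, eigvecs = np.linalg.eigh(tensor)
    ex, ey, et = eigvecs[:, 0]  # direction along which the signal is constant
    return ex / et, ey / et

t, y, x = np.indices((9, 17, 25))
uniform = np.ones((9, 17, 25))
moving_edge = np.tanh((x - 8 - t) / 2.0)  # vertical edge, 1 pixel per frame
moving_point = np.exp(-((x - 8 - t) ** 2 + (y - 8) ** 2) / (2 * 2.0 ** 2))

for name, vol in [("uniform", uniform), ("edge", moving_edge), ("point", moving_point)]:
    print(name, local_dimensionality(structure_tensor(vol)))
print(velocity_from_line(structure_tensor(moving_point)))
```

The moving point yields two significant eigenvalues, and the remaining eigenvector lies along the line it traces, which gives the velocity directly. The moving edge yields only one significant eigenvalue, so only the motion component normal to the edge is determined, and `velocity_from_line` is not meaningful in that case.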
A method of recovering the internal structure of a high-dimensional data set is given in Roweis and Saul, Nonlinear Dimensionality Reduction by Locally Linear Embedding, Science 290, pp 2323-2326, 2000. The method is based on the concept that a given data point may be represented as a linear combination of its nearest neighbours in the high-dimensional space. For data that lie on a lower-dimensional surface, the neighbouring points can be used to construct a local coordinate system that dispenses with at least some of the original axes. The result of this process is a “locally linear embedding”, which facilitates interpretation of the original data.
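The two stages described above, reconstruction weights followed by a low-dimensional embedding that preserves them, can be sketched in a few lines. This is a simplified illustration under stated assumptions: a brute-force neighbour search, a trace-scaled regularisation term `reg`, and hypothetical names and parameter values throughout.

```python
import numpy as np

def locally_linear_embedding(points, n_neighbors=6, n_components=1, reg=1e-3):
    """Return (embedding, weights) for an array of shape (n_points, n_dims)."""
    n = points.shape[0]
    weights = np.zeros((n, n))
    # Squared distances between all pairs of points (brute force).
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]  # skip the point itself
        z = points[nbrs] - points[i]
        gram = z @ z.T
        # Regularise, since the Gram matrix is singular when neighbours > dims.
        gram += reg * np.trace(gram) * np.eye(n_neighbors)
        w = np.linalg.solve(gram, np.ones(n_neighbors))
        weights[i, nbrs] = w / w.sum()  # weights for each point sum to one
    # Embedding coordinates: bottom eigenvectors of (I - W)^T (I - W),
    # discarding the first, which is constant.
    m = (np.eye(n) - weights).T @ (np.eye(n) - weights)
    _, eigvecs = np.linalg.eigh(m)
    return eigvecs[:, 1:n_components + 1], weights

# A one-dimensional curve embedded in three dimensions.
s = np.linspace(0.0, 1.5 * np.pi, 60)
curve = np.column_stack([np.cos(s), np.sin(s), 0.2 * s])
embedding, weights = locally_linear_embedding(curve)
print(embedding.shape, weights.sum(axis=1)[:3])
```

Each row of `weights` expresses one point as a combination of its neighbours, and the single embedding coordinate replaces the three original axes.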
Miller and Tieu, Colour Eigenflows: Statistical Modelling of Joint Colour Changes, Proceedings of the International Conference on Computer Vision, pp 607-614, 2001, disclose an image analysis method that accounts for changes in ambient lighting. A statistical model is constructed of how a given colour in a particular image changes with the illuminant. Given an input image, this model enables the appearance of the scene under different illumination to be predicted. Changes in the appearance of the image caused by other factors, for example the parallax introduced by moving the viewpoint, are not considered. This statistical model is akin to imposing a smoothness constraint on a photometric mapping.