It is known to perform pattern detection based on detection of spatio-temporal interest points in a segment of video data. In the known pattern recognition method, a classifier computes a detection result by comparing histograms of the interest points with reference histograms and assigning the classification of the closest reference histogram. A support vector machine may be used as the classifier, for example. For this purpose, the histograms of the interest points are computed by assigning the interest points to histogram bins dependent on the feature vectors of the interest points.
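By way of illustration, the nearest-reference classification step may be sketched as follows. The chi-squared distance, the function name and the example labels are illustrative assumptions, not details taken from the known method:

```python
import numpy as np

def classify_by_histogram(query_hist, reference_hists, labels):
    """Assign the label of the reference histogram closest to the
    query histogram (chi-squared distance; illustrative choice)."""
    eps = 1e-10
    dists = [
        0.5 * np.sum((query_hist - ref) ** 2 / (query_hist + ref + eps))
        for ref in reference_hists
    ]
    return labels[int(np.argmin(dists))]

# Hypothetical reference histograms for two pattern classes.
refs = [np.array([0.8, 0.1, 0.1]), np.array([0.1, 0.1, 0.8])]
labels = ["walking", "waving"]
print(classify_by_histogram(np.array([0.7, 0.2, 0.1]), refs, labels))  # prints walking
```

In practice a trained classifier such as a support vector machine would replace this nearest-neighbour rule, but the input in both cases is the same bin-count histogram.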
Spatio-temporal interest points and feature vectors are defined as follows. A segment of video data corresponds to a succession of images, wherein the position of an image in the succession represents temporal position and positions within an image represent spatial positions. A spatio-temporal interest point corresponds to coordinates comprising a spatial position r and a temporal position t of a time and position where image content changes as a function of position within an image and/or as a function of position in the succession of images. The coordinates may be used to define a region in the segment of video data relative to the interest point, the region consisting of a set of pixel positions with predetermined coordinate offsets to the coordinates of the interest point, e.g. a spatio-temporal block wherein the coordinate offset of each spatio-temporal coordinate is in a predetermined range for that coordinate.
The content of the images in such spatio-temporal regions relative to the detected interest points can be used to extract feature vectors, which may take the form of histograms of pixel values or pixel value gradients in the spatio-temporal regions.
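Such a feature vector may be sketched as a gradient histogram over a spatio-temporal block centred on an interest point. The block half-sizes, the gradient measure and the bin count below are illustrative assumptions, not the specific feature of the known method:

```python
import numpy as np

def feature_vector(video, point, half=(2, 4, 4), bins=8):
    """Histogram of spatial gradient magnitudes in a spatio-temporal
    block centred on an interest point (t, y, x). Illustrative sketch:
    half-sizes, gradient measure and bin count are assumptions."""
    t, y, x = point
    dt, dy, dx = half
    block = video[t - dt:t + dt + 1, y - dy:y + dy + 1, x - dx:x + dx + 1]
    # Spatial gradients within each image of the block.
    gy, gx = np.gradient(block.astype(float), axis=(1, 2))
    mag = np.hypot(gy, gx)
    hist, _ = np.histogram(mag, bins=bins, range=(0, mag.max() + 1e-9))
    return hist / hist.sum()  # normalised feature vector
```

Pixel-value histograms or temporal gradients could be substituted for the spatial gradient magnitude without changing the structure of the computation.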
The assignment of interest points to histogram bins (also called quantisation) can be performed with the aid of a decision tree with leaves that correspond to respective ones of the quantisation bins and nodes that correspond to decision criteria to be applied to feature vectors to select between different branches from the node.
The decision criterion at each node defines a threshold value for a selected feature value, such as the value of a selected component of the feature vector or, more generally, a selected function of the components of the feature vector. Each detected interest point is assigned to one of the quantisation bins, after selecting a path through the tree by applying the decision criteria of the nodes along the path to the feature vector of the detected interest point.
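The path selection described above may be sketched as follows; the tuple encoding of nodes and the example tree are illustrative assumptions:

```python
def quantise(tree, feature):
    """Descend the tree: at each node apply the threshold test to the
    selected feature component; the reached leaf is the bin index."""
    node = tree
    while isinstance(node, tuple):
        idx, thresh, left, right = node
        node = left if feature[idx] < thresh else right
    return node

# Two-level tree over four bins (structure and numbers are illustrative).
tree = (0, 0.5,
        (1, 0.3, 0, 1),   # branch taken when feature[0] < 0.5
        (1, 0.7, 2, 3))   # branch taken when feature[0] >= 0.5
print(quantise(tree, [0.2, 0.6]))  # prints 1
```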
More generally, a “decision forest” may be used, comprising a plurality of decision trees that each correspond to a different set of quantisation bins at the leaves of the tree, to assign an interest point to a quantisation bin in each of these sets. An article titled “Fast Discriminative Visual Codebooks using Randomized Clustering Forests” by Moosmann et al. in the Annual Conference on Neural Information Processing Systems 2006 (EPO reference XP055056764) describes the creation of randomized clustering forests.
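Accumulating one histogram over the concatenated bin sets of all trees in a forest may be sketched as follows; the node encoding and function names are illustrative assumptions carried over from the single-tree case:

```python
def quantise(tree, feature):
    """Hard quantisation: descend the tree to a leaf (bin index)."""
    while isinstance(tree, tuple):
        idx, thresh, left, right = tree
        tree = left if feature[idx] < thresh else right
    return tree

def forest_histogram(trees, features, bins_per_tree):
    """One histogram over the concatenated bin sets of all trees in
    the forest; every interest point votes once per tree."""
    hist = [0] * (len(trees) * bins_per_tree)
    for feat in features:
        for i, tree in enumerate(trees):
            hist[i * bins_per_tree + quantise(tree, feat)] += 1
    return hist

trees = [(0, 0.5, 0, 1), (1, 0.5, 0, 1)]   # two single-split trees
points = [[0.2, 0.8], [0.7, 0.3]]
print(forest_histogram(trees, points, bins_per_tree=2))  # prints [1, 1, 1, 1]
```

The resulting concatenated histogram takes the place of the single-tree histogram as input to the classifier.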
In the known method, at least the decision criteria and the reference histograms are selected using a training process, using segments of training video data. Methods to do so are known per se. The training process for the decision criteria involves selection among possible types of feature values and possible thresholds for each node.
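A common criterion for selecting among possible feature values and thresholds at a node is information gain over labelled training feature vectors; the exhaustive search below is an illustrative sketch, not the specific training process of the known method:

```python
import math

def entropy(labels):
    """Shannon entropy of a label list."""
    counts = {}
    for l in labels:
        counts[l] = counts.get(l, 0) + 1
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_split(features, labels):
    """Search all (feature index, threshold) pairs for the split with
    the largest information gain (hypothetical training criterion)."""
    base, best = entropy(labels), (None, None, -1.0)
    for idx in range(len(features[0])):
        for thresh in sorted({f[idx] for f in features}):
            left = [l for f, l in zip(features, labels) if f[idx] < thresh]
            right = [l for f, l in zip(features, labels) if f[idx] >= thresh]
            if not left or not right:
                continue
            n = len(labels)
            gain = base - (len(left) / n * entropy(left)
                           + len(right) / n * entropy(right))
            if gain > best[2]:
                best = (idx, thresh, gain)
    return best
```

Randomised forests typically restrict this search to a random subset of candidate features and thresholds per node, which decorrelates the trees.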
It is also known to use soft decision trees. The basics of decision trees, including soft decision trees, are described by Koutroumbas in Pattern Recognition (2008), pages 215-221 and pages 261-263 (EPO reference XP002693953). A soft decision tree uses non-binary decision functions to assign non-binary decision values to nodes in the tree. Soft decision trees have the advantage that small errors in feature values cannot lead to strongly different decision results. Koutroumbas describes the use of a standard soft decision function applied to normalized feature data. Quinlan et al. describe probabilistic decision trees in Machine Learning: An Artificial Intelligence Approach, part 3, sections 5.1 and 5.8 (EPO reference XP008160784).
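The robustness of soft decisions may be sketched with a sigmoid decision function that splits each interest point's unit weight over both branches; the node layout and the sigmoid scale parameter are illustrative assumptions:

```python
import math

def soft_quantise(tree, feature):
    """Soft decision tree: a sigmoid of the distance to the threshold
    splits the point's weight over both branches, so a small error in
    a feature value only shifts the bin weights slightly.
    Node = (index, threshold, scale, left, right); leaf = bin index."""
    weights = {}

    def descend(node, weight):
        if not isinstance(node, tuple):
            weights[node] = weights.get(node, 0.0) + weight
            return
        idx, thresh, scale, left, right = node
        p_right = 1.0 / (1.0 + math.exp(-(feature[idx] - thresh) / scale))
        descend(left, weight * (1.0 - p_right))
        descend(right, weight * p_right)

    descend(tree, 1.0)
    return weights  # fractional weight per quantisation bin, summing to 1
```

A feature value exactly at a threshold contributes half its weight to each branch, whereas a hard tree would flip between bins under the smallest perturbation.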
The use of soft decision trees in decision forests is described by Bonissone et al. in an article published in the International Journal of Approximate Reasoning, Vol. 51, pages 729-747 (EPO reference XP027142367). Lefort et al. describe the use of soft random forests in an article titled “Weakly supervised classification of objects in images using soft random forests” in Computer Vision – ECCV 2010, pages 185-198 (EPO reference XP019150735).
For real video segments, pattern recognition by the classifier always involves errors in the form of false positive and false negative detections. It is desirable to reduce such errors.