Methods and models for identifying an object (or, equivalently, a “target”) in a still image or a video frame are known. Such methods are used, for example, in infrared search and track (IRST) systems, in which a sensor acquires an infrared image of the scene under consideration and the image is generally converted into a greyscale format. The image consists of a two-dimensional array of pixels that represent the infrared intensity at various locations. Systems and methods currently exist for extracting and matching features of the outlines of input objects. These systems are used, for example, to determine that a target is of a known type so that it can be interpreted accordingly. In such a method, it is desirable to represent different outlines efficiently so that they can be stored in less space and the search process can be sped up.
One such known method uses a curvature scale space (CSS), wherein the outline of the object, which is a closed curve, is used to generate the CSS. For this purpose, an initial calculation that fits a curve to the contour of the object is generally applied to a binary silhouette image of the object under consideration. In this known method, circle of curvature values are calculated over the closed curve and object descriptors are derived from a scale space representation of the outline. These are represented by graphs, and peak values on these graphs are used as feature parameters. Using such a representation, various shapes in images can be identified, matched or aligned. One of the main problems with the CSS method is that it relies on the starting point of the calculations on the silhouette curve. Since the silhouette is a closed curve around the object in an image, it does not have a defined starting point, which is a problem when outlines must be matched or recognized. As a solution to this problem, the peak values are currently used for matching shapes, an approach that is sensitive to noise and to incorrect segmentation of outlines. When the outline is extracted with slight errors, the results of such an approach may vary significantly.
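The curvature calculation underlying a CSS representation can be illustrated with a minimal sketch. It is not the implementation of any cited method: the function names are hypothetical, the closed curve is assumed to be given as uniformly indexed samples, and smoothing is applied over the sample index rather than over true arc length. The sketch smooths the coordinates with a circular Gaussian filter, estimates the curvature by central differences, and lists the curvature zero crossings, which are the quantities usually plotted against scale in a CSS graph.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_closed(values, sigma):
    """Circular convolution: indices wrap around because the curve is closed."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [
        sum(kernel[j + radius] * values[(i + j) % n] for j in range(-radius, radius + 1))
        for i in range(n)
    ]

def curvature_at_scale(xs, ys, sigma):
    """Curvature of the smoothed closed curve, one value per sample."""
    sx, sy = smooth_closed(xs, sigma), smooth_closed(ys, sigma)
    n = len(xs)
    kappa = []
    for i in range(n):
        p, q = (i - 1) % n, (i + 1) % n
        dx, dy = (sx[q] - sx[p]) / 2.0, (sy[q] - sy[p]) / 2.0
        ddx = sx[q] - 2.0 * sx[i] + sx[p]
        ddy = sy[q] - 2.0 * sy[i] + sy[p]
        denom = (dx * dx + dy * dy) ** 1.5
        kappa.append((dx * ddy - dy * ddx) / denom if denom > 0 else 0.0)
    return kappa

def zero_crossings(kappa):
    """Indices i where the curvature changes sign between samples i and i+1."""
    n = len(kappa)
    return [i for i in range(n) if kappa[i] * kappa[(i + 1) % n] < 0]

def three_lobed_outline(n=360):
    """Illustrative test shape r = 10 + 4*cos(3*theta), with three concavities."""
    thetas = [2.0 * math.pi * i / n for i in range(n)]
    radii = [10.0 + 4.0 * math.cos(3.0 * t) for t in thetas]
    xs = [r * math.cos(t) for r, t in zip(radii, thetas)]
    ys = [r * math.sin(t) for r, t in zip(radii, thetas)]
    return xs, ys
```

For the illustrative three-lobed outline, the number of zero crossings returned at a small scale is larger than at a large scale, where the smoothed curve becomes convex. The index of each crossing depends on which sample is treated as the first one, which is the starting point dependence noted above.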
Another currently used method imposes an ordering on the peaks of the graph acquired by the CSS technique. For example, in one current implementation the peak coordinates are ordered with respect to the peak heights. Yet another technique uses the maximum peak as the starting point and represents the outlines starting from that peak. Again, these methods are prone to noise, incorrect segmentation and incorrect curve fitting.
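The sensitivity of these peak-based normalisations can be illustrated with a short sketch. The peak lists, the curve length of 100 units and the function names are hypothetical values chosen for illustration only; they are not taken from any cited document. Each peak is a (position, height) pair, where position is measured along the closed curve from the arbitrary starting point.

```python
def shift_start(peaks, offset, curve_length):
    """Peak list as seen when the curve is traversed from a different start."""
    return [((pos - offset) % curve_length, height) for pos, height in peaks]

def order_by_height(peaks):
    """Order peaks from highest to lowest."""
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)

def normalise_to_max_peak(peaks, curve_length):
    """Re-express positions relative to the highest peak, sorted by position."""
    ref_pos, _ = max(peaks, key=lambda peak: peak[1])
    return sorted(((pos - ref_pos) % curve_length, height) for pos, height in peaks)

# Hypothetical peaks of one outline (integer positions keep the arithmetic exact).
peaks = [(10, 5.0), (40, 4.8), (75, 2.0)]

# The same outline traversed from another starting point.
same_outline = shift_start(peaks, 30, 100)

# The same outline after a small segmentation error lowers the first peak.
noisy_outline = [(10, 4.7), (40, 4.8), (75, 2.0)]
```

In this example, normalising `peaks` and `same_outline` to the maximum peak yields identical lists, so the starting point no longer matters. The slightly perturbed `noisy_outline`, however, has a different maximum peak, so all of its normalised positions change even though the outline is nearly the same.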
The well-known scale invariant feature transform (SIFT) method, on the other hand, uses a scale space representation of a two-dimensional (2D) greyscale image, and the representative features of objects in the pixel image are generally found by computing difference of Gaussian images that form the scale space. Using this two-dimensional method, various objects in pixel images can be represented by a list of salient points and can be compared, identified or matched. The disadvantages of the CSS outline feature extraction methods and the inability of the SIFT methods to find feature points representing only the outline of an object necessitate a new method. In some applications, such as infrared imaging systems, targets or objects in the image generally have their features on their outlines. Furthermore, some objects do not require a complete representation and can be identified by their outlines alone. Storing feature points only on the outline takes less space, and searching and matching are much faster.
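The difference of Gaussian computation mentioned above for SIFT can be sketched as follows. This is a simplified illustration and not the SIFT implementation: it computes a single difference image rather than a full pyramid, the function names and the scale ratio `k` are assumptions, and image borders are handled by replicating the edge pixels. The image is a plain list of rows of greyscale values.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating the border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(kernel[j + radius] * row[min(max(x + j, 0), width - 1)]
                for j in range(-radius, radius + 1))
            for x in range(width)
        ]
        for row in image
    ]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def difference_of_gaussians(image, sigma, k=2.0):
    """Image blurred at scale k*sigma minus the image blurred at scale sigma."""
    fine = gaussian_blur(image, sigma)
    coarse = gaussian_blur(image, k * sigma)
    return [[c - f for c, f in zip(row_c, row_f)] for row_c, row_f in zip(coarse, fine)]
```

Because the response is computed at every pixel of the image, the extrema of such difference images may lie anywhere inside the object and are not restricted to its outline.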
The current methods do not, on their own, offer a reliable and efficient way of extracting, representing and matching silhouette image contour features, and a new methodology is introduced in this document.
The British patent document GB2393012, an application in the state of the art, discloses a method for searching a two-dimensional outline, the method comprising inputting a query and deriving a descriptor of said outline from a curvature scale space representation of the outline of the object, wherein the peak co-ordinate values of the CSS representation are ordered on the basis of the peak height values.
The United States patent document U.S. Pat. No. 7,430,303, an application in the state of the art, discloses a method for extracting and matching gesture features of an image, wherein a closed curve formed by a binary contour image of the gesture image is used to form a curvature scale space (CSS) image and feature parameters are determined by extracting first plural peaks.
The United States patent document U.S. Pat. No. 6,711,293, an application in the state of the art, discloses a method and apparatus for identifying scale invariant features in an image by producing a plurality of difference images, blurring an initial image to produce a blurred image and subtracting the blurred image from the initial image to produce the difference image.
The application titled “System and Method for Identifying Scale Invariant Features of Object Outlines on Images” and numbered PCT/IB2012/050883 mainly focuses on extracting scale invariant features from closed planar curves (silhouettes) and representing these features on “silhouette feature histograms”. Three main steps in the system and method may be defined as: curve extraction, feature extraction and descriptor construction.
The curve extraction step includes fitting a continuous curve to the contours of the silhouette and arc-length sampling this continuous curve. The next step, feature extraction, includes curvature scale space construction and feature selection in this scale space. The final step, descriptor construction, uses the extracted frames as pixels of a rectangular (or radial) image, in which the location of each pixel designates the position of a feature (over the curve) and its scale (in the curvature scale space), and the colour of each pixel represents the orientation of the feature (on the curve plane). In this final step, these images are matched to each other in a manner that is invariant to rotation and to the starting point, so as to accomplish object recognition tasks.
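The arc-length sampling in the curve extraction step can be illustrated with a minimal sketch. As a simplifying assumption, the fitted continuous curve is replaced here by a closed polyline through the contour points, and the function name `resample_closed_curve` is hypothetical. The sketch returns a requested number of points that are equally spaced along the perimeter.

```python
import math

def resample_closed_curve(points, n_samples):
    """Return n_samples points equally spaced by arc length on a closed polyline.

    points: list of (x, y) vertices; the last vertex connects back to the first.
    """
    n = len(points)
    # Length of each segment, including the closing segment back to the start.
    seg = [
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    ]
    step = sum(seg) / n_samples
    samples = []
    i = 0             # index of the segment currently being walked
    start_len = 0.0   # arc length accumulated before segment i
    for k in range(n_samples):
        target = k * step
        # Advance to the segment that contains the target arc length.
        while i < n - 1 and start_len + seg[i] < target:
            start_len += seg[i]
            i += 1
        t = (target - start_len) / seg[i] if seg[i] > 0 else 0.0
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]
        samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return samples

# Usage: eight samples on a square of side 4 are spaced 2 units apart.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
samples = resample_closed_curve(square, 8)
```

Uniform spacing along the curve means that a feature position can be expressed as a fraction of the total curve length, independently of how densely the original contour pixels were traced.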
The technique gives satisfactory results. However, due to the nature of the technique, the features are generally extracted from the high-curvature regions of the planar curve. Thus, fewer features are extracted from relatively smooth silhouettes. In this case, the descriptor image becomes sparse (consisting mostly of empty pixels), since the number of extracted features is relatively low. Matching two descriptor images with markedly different feature densities, in other words matching two silhouettes of which one is smooth and the other is much more curved, may become inefficient.
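The sparsity problem can be made concrete with a small sketch of a rectangular descriptor image. This is an illustration under stated assumptions and not the descriptor of the cited application: the function names, the bin counts and the feature values are hypothetical, positions and scales are assumed to be normalised to the range from 0 to 1, and the orientation is stored directly as the pixel value, with `None` marking an empty pixel.

```python
def build_descriptor(features, n_positions, n_scales):
    """Place each (position, scale, orientation) feature into a pixel grid.

    Columns index the position along the curve, rows index the scale, and
    the stored value is the orientation of the feature in radians.
    """
    image = [[None] * n_positions for _ in range(n_scales)]
    for position, scale, orientation in features:
        col = min(int(position * n_positions), n_positions - 1)
        row = min(int(scale * n_scales), n_scales - 1)
        image[row][col] = orientation
    return image

def fill_ratio(image):
    """Fraction of pixels that hold a feature."""
    cells = [value for row in image for value in row]
    return sum(value is not None for value in cells) / len(cells)

# Hypothetical features of a smooth silhouette and of a more curved one.
smooth_features = [(0.1, 0.2, 0.0), (0.6, 0.7, 1.5)]
curved_features = [(0.05, 0.1, 0.3), (0.2, 0.3, 1.1), (0.35, 0.6, 2.0),
                   (0.5, 0.8, 2.9), (0.7, 0.2, 4.0), (0.9, 0.55, 5.2)]

smooth_image = build_descriptor(smooth_features, n_positions=8, n_scales=4)
curved_image = build_descriptor(curved_features, n_positions=8, n_scales=4)
```

With these illustrative values, the smooth silhouette fills a smaller fraction of the pixel grid than the curved one, so a pixel-by-pixel comparison of the two images has very few occupied pixels in common to work with.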