Abnormality detection techniques identify, as abnormal data, data whose features differ from those of the other data in a set of data sequences. These techniques are applied in a wide range of fields such as defect detection, image recognition, and data mining.
For example, on a printed board, similar patterns are often arranged side by side in succession, and therefore a defect of the board can be detected by detecting a pattern that differs from its peripheral patterns.
Further, by detecting a pixel different from peripheral pixels in an image of an ocean surface, a drowning person can be detected, and hence the abnormality detection techniques can also be applied to sea rescue.
Furthermore, the techniques can also be applied to behavior mining, in which a behavior different from the usual one is extracted from a behavior pattern.
There are a number of patent documents that disclose these types of abnormality detection techniques.
For example, in a pattern inspection apparatus described in Patent Document 1, a differential image between an image to be inspected and a reference image is obtained, and at the same time, an error probability indicative of the degree of a defect is obtained from a pixel value of the differential image. Then, this probability is compared with a predetermined threshold value to determine whether a defect is present.
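As an illustration only, the differential-image approach outlined above can be sketched as follows. The function name `detect_defects`, the parameters `scale` and `threshold`, and in particular the mapping from a pixel difference to a score are assumptions introduced here for illustration. The error probability actually used in Patent Document 1 is not specified in this description.

```python
import math


def detect_defects(inspected, reference, scale=50.0, threshold=0.5):
    """Return (row, column) positions whose difference-derived score exceeds the threshold.

    The score 1 - exp(-|difference| / scale) is a hypothetical stand-in for
    the error probability; it is not taken from Patent Document 1.
    """
    defects = []
    for r, (row_inspected, row_reference) in enumerate(zip(inspected, reference)):
        for c, (a, b) in enumerate(zip(row_inspected, row_reference)):
            diff = abs(a - b)                      # pixel value of the differential image
            score = 1.0 - math.exp(-diff / scale)  # assumed difference-to-score mapping
            if score > threshold:                  # comparison with a predetermined threshold
                defects.append((r, c))
    return defects


reference_image = [[100, 100], [100, 100]]
inspected_image = [[100, 102], [100, 200]]
defect_positions = detect_defects(inspected_image, reference_image)
```

In this sketch, a small deviation such as the pixel valued 102 stays below the threshold, while the pixel valued 200 is flagged.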
In an image processing algorithm evaluation device described in Patent Document 2, on the basis of feature distributions of a pseudo defect group and a true defect group, a separation degree between the groups is calculated. Then, the calculated separation degree is used as an evaluation value to adjust a parameter of an image processing algorithm.
Further, in an abnormal area detection device described in Patent Document 3, a distance between an abnormal area and a normal area is measured on the basis of a high-dimensional local autocorrelation for every pixel of image data, and a pixel farther away than a predetermined distance is determined to be abnormal.
Patent Document 1: JP-A-2004-101214
Patent Document 2: JP-A-2006-085616
Patent Document 3: JP-A-2007-334766
However, according to the descriptions of Patent Documents 1 to 3, abnormal data is determined on the basis of a distance (a difference from a reference value or the like) between features of pieces of data, and these techniques therefore have the following problems.
The first problem is that the conventional techniques cannot be applied if the features include a defect or an outlier.
For example, with the distance used in the abnormal area detection device described in Patent Document 3, even when two pieces of data are similar, the distance between them becomes large if a portion of the features includes a defect or an outlier and the difference between the features in that portion becomes large. Consequently, data that is not actually abnormal is erroneously determined to be abnormal owing to the defect or the outlier in the features.
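The first problem can be illustrated with a minimal sketch that assumes the Euclidean distance as the usual distance. The feature values and the names `euclidean`, `d_clean`, and `d_outlier` are hypothetical and chosen only for illustration.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


features_a = [1.0, 2.0, 3.0, 4.0]
features_b = [1.1, 2.1, 2.9, 4.0]           # similar to features_a in every component
features_b_outlier = [1.1, 2.1, 2.9, 40.0]  # same data, but the last component is an outlier

d_clean = euclidean(features_a, features_b)
d_outlier = euclidean(features_a, features_b_outlier)

print(f"distance without outlier: {d_clean:.3f}")
print(f"distance with outlier:    {d_outlier:.3f}")
```

Although three of the four components are unchanged, the single outlying component raises the distance by more than two orders of magnitude, so a threshold on the distance would treat the otherwise similar data as abnormal.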
The second problem is that the conventional techniques cannot be applied if the dimension of the features of data is high.
That is, when a usual distance scale is used and the dimension is high, the determination of the similarity of data becomes unstable. The reason is that, with the usual distance scale, the contribution of a component having a small distance among the components of a high-dimensional pattern is noticeably smaller than that of a component having a large distance. Consequently, a component having a large distance, such as an outlier, dominates the components having small distances, and hence the determination of similarity is unstable.
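The imbalance between contributions described above can be sketched numerically, assuming the squared Euclidean distance as the usual distance scale. The function name `largest_component_share`, the dimension of 100, and the component values are assumptions made only for this illustration.

```python
def largest_component_share(x, y):
    """Fraction of the squared Euclidean distance due to the single largest component."""
    squared_diffs = [(a - b) ** 2 for a, b in zip(x, y)]
    total = sum(squared_diffs)
    return max(squared_diffs) / total if total > 0 else 0.0


dimension = 100
pattern_x = [0.0] * dimension
# 99 components differ slightly (0.1 each); one component differs greatly (10.0).
pattern_y = [0.1] * (dimension - 1) + [10.0]

share = largest_component_share(pattern_x, pattern_y)
print(f"share of the largest component: {share:.4f}")
```

Under these assumed values, the one large component accounts for roughly 99% of the squared distance, so the 99 closely matching components have almost no influence on the similarity determination.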