Foreground object detection has been an active area of research in recent years, primarily because of its direct application in many intelligence, surveillance, and reconnaissance (ISR) tasks. A common method for detecting foreground objects or objects of interest (“targets” in military jargon) in a sequence of time-ordered images or video (collectively, “imagery”) is change or anomaly detection. Examples of ISR tasks include the detection of threats and suspicious activities and the tracking of targets. Given the tactical nature of most ISR tasks, it is very important that a change detection system perform robustly, i.e., provide a high true detection rate, a low false detection rate, and a low missed detection rate in the presence of clutter.
Table 1 lists the most common sources of clutter found in electro-optical (EO) imagery. The amount of clutter can become overwhelming as the field of regard increases, as with newer large-format aerial sensors that cover several square kilometers. As can be seen from Table 1, shadows are by far the greatest source of clutter, accounting for 95.0% of false alarms. Therefore, the effective removal of shadows is of prime importance in the development and application of foreground object detection methods.
TABLE 1
Typical sources of false alarms and their relative abundance in aerial EO imagery.

Source of false alarms        Relative abundance
Shadows                       95.0%
Local misregistration          2.0%
Sensor anomalies               1.0%
Local intensity changes        1.0%
Pure parallax                  0.6%
Image defocus                  0.4%
Total false alarms           100.0%
Several prior art techniques that detect foreground objects through change detection employ a reference, such as a single prior image or a dynamically updating background model built from video, as described in U.S. Pat. Nos. 6,546,115, 6,731,799, and 6,999,600. Background modelling is appropriate when one or more static cameras are employed to view a relatively fixed scene repeatedly. Unfortunately, a background model cannot be built when there is little overlap between one image and the next, as is the case with airborne imagery. Other conventional object detection methods rely on matched-filter-type responses using stored templates, as described in “Detection Filters and Algorithm Fusion for ATR”, by David Casasent et al., IEEE Transactions on Image Processing, IEEE New York, USA, vol. 6, No. 1, January 1997, pp. 114-125 (hereinafter “Casasent97”). The method described in Casasent97 relies on the generation and detection of digital signatures based on video images stored in a database and compared to signatures generated from the imagery under examination. This method is intolerant of the kinds of distortions that may be present in the video being examined, such as those due to variations in viewing geometry and atmospheric conditions.
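By way of illustration only, the general idea of a dynamically updating background model for a static camera can be sketched as follows. This is a minimal sketch and not the method of any of the patents cited above: the exponential running average, the function names `update_background` and `foreground_mask`, and the values of `alpha` and `threshold` are all assumptions chosen for the example.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def update_background(background: Image, frame: Image, alpha: float = 0.05) -> Image:
    """Blend the new frame into the background with an exponential running average.

    alpha is a hypothetical learning rate; larger values adapt faster.
    """
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background: Image, frame: Image, threshold: float = 25.0) -> List[List[bool]]:
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Static 3x3 scene; a bright object appears at the center of the new frame.
background = [[10.0] * 3 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][1] = 200.0

mask = foreground_mask(background, frame)          # only the center pixel is flagged
background = update_background(background, frame)  # the model drifts toward the new frame
```

The sketch compares each frame with the model pixel by pixel, so it presupposes that successive frames are aligned. This is why such a model is difficult to maintain when consecutive images overlap only slightly.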
A typical approach for the removal of shadows in single spectral images involves a transformation of the input color space such that shadows are restricted to a single color channel, as described in U.S. Pat. No. 7,366,323 and in Salvador et al., “Shadow Identification and Classification Using Invariant Color Models”, IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 3, 2001, pp. 1545-1548. However, image noise in the transformed color space tends to be higher than that in the original image data. Multispectral image analysis techniques for shadow detection, such as those described in U.S. Pat. No. 7,184,890, cannot be directly applied to single spectral images because these techniques exploit the characteristics of individual spectra when integrating information.
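By way of illustration only, the general idea of a shadow-insensitive color transformation can be sketched with normalized rgb chromaticity. This is a generic example and not the specific transformation of the references cited above. It assumes, as a simplification, that a shadow scales all three channels of a pixel by the same factor. The names `chromaticity` and `chroma_distance` and the sample pixel values are hypothetical.

```python
from typing import Tuple

RGB = Tuple[float, float, float]


def chromaticity(pixel: RGB, eps: float = 1e-6) -> RGB:
    """Return normalized rgb: each channel divided by the channel sum."""
    r, g, b = pixel
    total = r + g + b + eps  # eps guards against division by zero on black pixels
    return (r / total, g / total, b / total)


def chroma_distance(p: RGB, q: RGB) -> float:
    """Largest per-channel chromaticity difference between two pixels."""
    return max(abs(a - b) for a, b in zip(chromaticity(p), chromaticity(q)))


sunlit: RGB = (120.0, 90.0, 60.0)
shadowed: RGB = (48.0, 36.0, 24.0)   # same surface, assumed uniformly 60% darker
vehicle: RGB = (60.0, 60.0, 150.0)   # a differently colored object

shadow_change = chroma_distance(sunlit, shadowed)  # near zero: intensity scaling cancels
object_change = chroma_distance(sunlit, vehicle)   # clearly larger: the color changed
```

Because the normalization divides by the channel sum, small fluctuations in dark pixels are amplified, which is consistent with the observation above that noise tends to be higher in the transformed color space.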
The prior art generally relies on heuristics or static background/scene knowledge, which renders existing change-based target detection systems “brittle,” i.e., such systems are likely to fail in an unexpected manner when deviations from the heuristics or background knowledge are large. To avoid large-scale failure, most existing systems are operated in a restricted manner, such as only during a specific time of day.
Accordingly, what would be desirable, but has not yet been provided, is a method and system for detecting targets in imagery by analyzing the temporal changes effected by the targets.