Surveillance and security systems are known in the art. Such systems are generally used, for example, in airports, malls, factories and various types of buildings (e.g., office buildings, hospitals, laboratories etc.). Such a surveillance and security system generally includes imagers for acquiring images of an observed scene (e.g., a gate at an airport, the lobby of a building, the perimeter of a factory). The objects in the acquired images are tracked and classified either by a user observing the images or automatically.
U.S. Pat. No. 6,985,172, to Rigney et al., entitled “Model-Based Incident Detection System with Motion Classification” is directed to a surveillance apparatus and methods for detecting incidents. The surveillance system includes a motion detection unit, a digital computer, a visible spectrum video camera, a frame-grabber, a wireless transmitter and an infrared camera. The digital computer includes an input-output means, a memory, a processor, a display and a manually-actuated input means. The computer input-output means includes a wireless receiver and an infrared frame-grabber. The visible spectrum video camera is coupled to the frame-grabber and wirelessly coupled to the transmitter. The transmitter communicates wirelessly with the receiver. The infrared camera is coupled to the infrared frame-grabber of the computer input-output means. The display and the processor are coupled to the computer input-output means. The manually-actuated input means and the memory are coupled to the processor.
The motion detection unit generates a reference image of a scene, which does not contain any moving objects, and generates a temporal difference image between acquired images and the reference image. Accordingly, the motion detection unit detects moving objects in a particular image and generates a low-level detection motion image (i.e., an image including only the moving objects). Thereafter, the motion detection unit spatially and temporally analyzes the low-level detection motion image. The spatial analysis includes extracting spatial features (e.g., size, shape, position, texture and moments of intensity values) according to the low-level detection motion image and the original image, and classifying the detected objects (e.g., according to set points, statistics or parametric criteria). The temporal analysis includes object tracking and statistical motion analysis. The motion detection unit then recognizes anticipated or unanticipated motion of the object according to a model-based motion analysis.
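The reference-image differencing described above can be illustrated with a minimal Python sketch. It is not taken from the Rigney et al. patent: the function names, the fixed threshold of 25 grey levels and the representation of images as nested lists of intensities are all assumptions made for illustration.

```python
def difference_mask(frame, reference, threshold=25):
    """Mark pixels whose intensity differs from the reference image.

    `threshold` is an assumed, illustrative value; the source does not
    specify how the difference image is thresholded.
    """
    return [
        [1 if abs(p - r) > threshold else 0 for p, r in zip(frame_row, ref_row)]
        for frame_row, ref_row in zip(frame, reference)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None."""
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))


# A 4x5 static scene and a frame in which three pixels have changed.
reference = [[10] * 5 for _ in range(4)]
frame = [row[:] for row in reference]
frame[1][2] = 200
frame[2][2] = 210
frame[2][3] = 190

mask = difference_mask(frame, reference)
box = bounding_box(mask)  # position and extent usable as simple spatial features
```

The mask plays the role of the low-level detection motion image, and the bounding box stands in for the position and size features used in the spatial analysis.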
U.S. Patent Application Publication US2008/0129581 to Douglass et al., entitled “System and Method for Standoff Detection of Human Carried Explosives” is directed, in one embodiment therein, to a system, which includes RADAR and video cameras. The RADAR transmits an electromagnetic beam which exhibits at least one transmit polarization and at least one receive polarization. At least one video camera is a Wide Field Of View (WFOV) camera and at least one other video camera is a Narrow Field Of View (NFOV) camera. The video cameras may be multi-spectral or hyperspectral video cameras. The WFOV camera collects video data and transmits the data to a processor, which applies software-implemented motion detection and segmentation algorithms to separate moving objects from the stationary background. A two-sided composite hypothesis test is then applied to classify each detected moving object as being either “human” or “other”. A database of constraints on human characteristics is utilized to classify the object type of each moving object detected. This database contains data elements including, for example, size, shape, thermal profile (i.e., applicable for infrared or multi-spectral cameras), color (i.e., applicable for color cameras) and motion as an aid in classifying object type. Moving objects that are not consistent with human motion are discarded from further consideration. Detected moving objects that have been classified as human-like are then sent to the motion tracker module along with the video stream collected by the WFOV camera. Information passed to a motion tracker module by a motion detector module includes the video frame number, mask or region of interest delimiters that describe the location of the moving object in the video frame, and the ancillary statistics used in the two-sided composite hypothesis test. The motion tracker module then generates and maintains a track for each object cued by the motion detector module. Tracks are maintained until the object leaves the field of view of the WFOV camera.
A threat motion analysis module examines each track based upon a database of stored characteristics, and determines the threat potential of the moving track. The database contains measurements or parameters that have been characterized as threat motions, either by an operator or from analysis of past threats. These characteristics may take the form of region constraints such as, “anybody who enters this demarked area is considered a threat”, or may consist of time and region constraints such as, “anybody who enters the demarked area at the wrong time”. Other threats could be someone moving toward the protected area at a higher rate of speed than other objects in the area. An alternate strategy would have the threat motion analysis module compare the motion statistics of new tracks to the database estimates of “normal” motion statistics within the surveillance area to determine if the motion of the tracked object represents an anomaly.
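The region constraint and the combined time and region constraint described above could be expressed, for example, as in the following Python sketch. The `Region` class, the `violates` function and the representation of a track as `(hour, x, y)` samples are hypothetical names and structures chosen for illustration; the publication does not disclose a specific data format.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Axis-aligned demarked area (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def violates(region, track, forbidden_hours=None):
    """Check a track against a region constraint.

    `track` is assumed to be a list of (hour, x, y) samples. With
    `forbidden_hours=None`, any entry into the region is a violation;
    otherwise only entries during the listed hours count.
    """
    for hour, x, y in track:
        in_region = region.contains(x, y)
        at_wrong_time = forbidden_hours is None or hour in forbidden_hours
        if in_region and at_wrong_time:
            return True
    return False


protected = Region(0, 0, 10, 10)
daytime_track = [(12, 20, 20), (12, 5, 5)]
night_track = [(23, 5, 5)]

violates(protected, daytime_track)            # region-only rule
violates(protected, daytime_track, {22, 23})  # time and region rule
violates(protected, night_track, {22, 23})
```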
At any one time, a track may be classified as a “threat”, a “non-threat”, or “indeterminate”. The threat motion analysis module operates to detect dynamic or emerging threats in the presence of uncertainty by applying a sequential hypothesis test. A sequential hypothesis test is a statistic-based test known to those skilled in the art of statistics. Unlike a normal hypothesis test which outputs a binary “yes/no” answer, a sequential hypothesis test allows for a third answer, “don't know, collect more data”. The idea is that at each point in time, the system collects additional information until it has enough information to make a decision about a given hypothesis. Preset or operator selected parameters within the sequential hypothesis test enable the cost of collecting more data to be incorporated into the optimization. With “indeterminate” threats, the hypothesis test is sequentially reapplied with each new observation until a threat determination can be made or the object has left the field of view. Note that the definition of a threatening motion can vary depending on the scenario. Threatening motions could be defined as an object with a motion vector toward the protected area and/or motion velocities that are markedly different from the average velocity of other objects in the field of view. A plurality of human carried explosives detection systems may be employed and networked to form a collaborative detection and tracking system.
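The three-way outcome of a sequential hypothesis test can be illustrated with Wald's sequential probability ratio test, a standard form of such a test. The publication does not state which sequential test is used, so the sketch below is an assumption: each observation is reduced to a binary "suspicious" flag, `p0` and `p1` are assumed probabilities of a suspicious observation for non-threat and threat tracks respectively, and the stopping thresholds are derived from assumed error rates `alpha` and `beta` rather than from an explicit cost of collecting more data.

```python
import math


def sprt(observations, p0=0.1, p1=0.6, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test on binary observations.

    Returns "threat", "non-threat", or "indeterminate" (collect more data).
    All parameter values are illustrative assumptions.
    """
    upper = math.log((1 - beta) / alpha)   # accept "threat" at or above this
    lower = math.log(beta / (1 - alpha))   # accept "non-threat" at or below this
    llr = 0.0                              # cumulative log-likelihood ratio
    for suspicious in observations:
        if suspicious:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "threat"
        if llr <= lower:
            return "non-threat"
    return "indeterminate"


sprt([1])           # one observation is not enough to decide
sprt([1, 1])        # two suspicious observations cross the upper threshold
sprt([0, 0, 0, 0])  # repeated normal observations cross the lower threshold
```

Re-running the function as each new observation is appended to the list corresponds to reapplying the test until a determination is made or the object leaves the field of view.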
U.S. Patent Application Publication US2004/0130620 to Buehler et al., entitled “Method and System for Tracking and Behavioral Monitoring of Multiple Objects Moving Through Multiple Fields-Of-View” is directed to a system and method of video analysis of frames from a plurality of image sensors (e.g., visible light or infrared sensors). Each image sensor has its own field-of-view (FOV) which may overlap with the FOV of another image sensor. Together, the image sensors observe a monitored environment. The method includes concurrently tracking, independent of calibration, multiple objects within the monitored environment, both as the objects move between FOVs and as multiple objects move within one FOV.
The video analysis system may include a receiving module, a tracking module (also referred to as a “tracker”), a classifier and a rules engine. According to one embodiment therein, the receiving module receives a plurality of series of video frames from a plurality of cameras and provides the video frames to the tracking module. The tracking module concurrently tracks a plurality of objects both within an FOV of a single camera and within the FOVs of multiple cameras. The output of the tracking module may be stored in a database. The classifier may perform two different types of classification, static classification and dynamic classification. Static classification refers to a classification procedure that operates on a group of pixels from a single instant in time (i.e., from a single frame of video). This type of classification may include assigning instantaneous properties of the pixel group to the pixel group. These properties, which may include, for example, size, color, texture, or shape, are used to determine if the group of pixels is of interest. Dynamic classification refers to classification rules that examine a pixel group over a period of time to make the classification. For example, dynamic classification properties include velocity, acceleration, change in size, change in area, change in color, lack of motion, or any time dependent property.
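The distinction between static and dynamic classification properties can be illustrated with the following Python sketch. It is a simplified assumption rather than the method of Buehler et al.: a pixel group is represented as a list of `(x, y)` coordinates, static properties are computed from a single frame, and dynamic properties are computed from the same group observed in successive frames spaced `dt` apart. All function names are hypothetical.

```python
def centroid(pixels):
    """Mean (x, y) position of a non-empty pixel group."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def static_properties(pixels):
    """Instantaneous properties computed from one frame only."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return {
        "size": len(pixels),
        "width": max(xs) - min(xs) + 1,
        "height": max(ys) - min(ys) + 1,
    }


def dynamic_properties(groups, dt=1.0):
    """Time-dependent properties from at least two observations of a group."""
    x0, y0 = centroid(groups[0])
    x1, y1 = centroid(groups[-1])
    elapsed = dt * (len(groups) - 1)
    return {
        "velocity": ((x1 - x0) / elapsed, (y1 - y0) / elapsed),
        "change_in_size": len(groups[-1]) - len(groups[0]),
    }


# The same object seen in two consecutive frames: it moves right and grows.
frame_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
frame_b = [(2, 0), (3, 0), (4, 0), (2, 1), (3, 1), (4, 1)]

static_a = static_properties(frame_a)
motion = dynamic_properties([frame_a, frame_b])
```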
The classifier may include a first pass classifier that is used to remove noisy pixels and other artifacts or external variables. A second pass classifier is used in correlation with the output of the tracking module. This interaction includes any combination of spatial, temporal, image feature and motion output from a tracking system. The second pass classifier looks at the data from the tracking module and compares it with data from other frames. Characteristics of followed objects are analyzed along with a state history of that particular object. Various predetermined characteristics of the pixel group are used, for example, motion information (e.g., velocity), grouping information, and appearance/signature information. The rules engine evaluates tracking data to determine whether specific conditions have been met and may also allow users to search for specific information created by the tracking module that in some instances may also have been processed by the classifier.