The invention relates to a method for monitoring a surveillance area.
Known methods for the detection and analysis of events in a surveillance area monitored by a monitoring system are based on the analysis (content analysis) of video data and, optionally, audio data, which are captured by appropriate sensors and delivered to analysis modules of the monitoring system. Video cameras, motion detectors, sound sensors, and the like are used as sensors, for instance. A monitoring system of the generic type in question may include hundreds or even thousands of sensors whose signals have to be processed. This processing also encompasses the comparison of the signals with values stored in a memory device.
To enable a fast search in the memory device and a comparison with the data stored in it, it has proved expedient to index the signals. In the indexes (metadata), typically only the outcomes of this content analysis are stored, but not intermediate outcomes such as texture, histogram, and so forth. If an internal semantic representation is generated at all in the monitoring system, it is generally limited to representing objects detected in a scene under surveillance (such as “moving object”, “probably car”) or the scene under surveillance itself (such as “outdoors”, “beach”). However, such representations are not generic. Therefore, when a query based on such analyses is made to a memory device holding stored monitoring results, a search can be made only for previously known events. Furthermore, in conventional multimodal retrieval systems, only a search by means of an example (query by example) is possible, but not a search by text description.
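By way of illustration only, the limitation of such an index can be sketched as follows. The sketch assumes a hypothetical index keyed by the final label of the content analysis; the names IndexEntry, build_index, and search, as well as the sample sensor identifiers and timestamps, are illustrative assumptions and do not describe any particular known system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical metadata record: only the final analysis outcome is kept."""
    sensor_id: str
    timestamp: float
    label: str  # e.g. "moving object"; no texture or histogram data is stored


def build_index(entries):
    """Group entries by their label so that a lookup by label is fast."""
    index = {}
    for entry in entries:
        index.setdefault(entry.label, []).append(entry)
    return index


def search(index, label):
    """Return all entries stored under exactly this previously known label."""
    return index.get(label, [])


# Illustrative sample data (assumed, not taken from a real system).
entries = [
    IndexEntry("camera-1", 10.0, "moving object"),
    IndexEntry("camera-2", 12.5, "probably car"),
    IndexEntry("camera-1", 20.0, "moving object"),
]
index = build_index(entries)

known = search(index, "moving object")
unforeseen = search(index, "car stopping next to a moving object")
```

In this sketch, a query for a label fixed at indexing time returns the stored entries, whereas a query describing an event that was not foreseen at indexing time returns nothing, because the intermediate data needed to evaluate it was never stored.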