The invention relates to a process and apparatus for surveying a given surveillance area using a detection device, especially at least one video camera and/or microphone.
Security surveillance of facilities and spaces takes place on the one hand using specific physical sensors (for example, photoelectric barriers, motion detectors, gas sensors) and on the other hand using video cameras. Video cameras have the advantage that the situation can be assessed from afar by a guard and that consequently even relatively complex situations which cannot be acquired using specific physical sensors can be comprehensively surveyed.
To minimize costs for surveillance personnel, a larger number of cameras is generally connected so that they can be switched to a few shared monitors. Switching can take place in given cycles or selectively (for example when motion is detected).
One problem of surveillance by video cameras is that the guard becomes fatigued over time. The video images are viewed only superficially or sporadically over time.
The object is to devise a process of the initially mentioned type which makes surveillance with video cameras much more reliable and also more efficient.
In the process, first the characteristic data are determined from an instantaneous signal segment by automatic signal analysis. These data are stored or buffered as a data set (for example with a time stamp) in order to then be statistically compared to data from other data sets which meet certain criteria.
By statistical comparison the surveillance system can recognize extraordinary situations and selectively notify the guard of them. It should be noted that the system itself recognizes which situations are extraordinary and therefore must be checked more closely. Nor is it necessary to establish beforehand what should be considered extraordinary in a certain surveillance situation. After a certain start-up time the system has by itself collected the statistics of an ordinary situation. (It is not disruptive if extraordinary situations arise during the start-up time, since their infrequency means they have no significant effect on the statistics anyway.)
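The comparison described above can be illustrated with a minimal sketch. The text does not specify which statistical test is used; the z-score with a threshold of three standard deviations, the `DataSet` layout and the function name `extraordinary_features` are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, stdev


@dataclass
class DataSet:
    """One stored record: a time stamp plus characteristic data (hypothetical layout)."""
    timestamp: datetime
    features: dict  # feature name -> numeric value


def extraordinary_features(current, history, threshold=3.0):
    """Return the names of features in `current` that deviate from `history`.

    Assumption: a feature is "statistically relevant" when it lies more than
    `threshold` sample standard deviations from the historical mean.
    """
    flagged = []
    for name, value in current.features.items():
        past = [d.features[name] for d in history if name in d.features]
        if len(past) < 2:
            continue  # start-up time: not enough statistics collected yet
        mu, sigma = mean(past), stdev(past)
        if sigma == 0:
            # History is perfectly constant; any change counts as a deviation.
            if value != mu:
                flagged.append(name)
        elif abs(value - mu) / sigma > threshold:
            flagged.append(name)
    return flagged
```

Because the history simply accumulates, a rare extraordinary record in the start-up time shifts the mean and standard deviation only slightly, which matches the remark above.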
Within the framework of automatic image analysis the video image is preferably broken down into several segments (image regions). The segments can partially overlap or can be completely disjoint. For each segment characteristic data or features are determined accordingly. The different segments can be treated the same or differently. In the former case, for example, the same kind of data set is determined for each segment. In the latter case the segments are combined, for example, into groups, with different data sets being computed for different groups. In this way it is possible, for example, to survey a space in which one partial area has continuous motion (for example, as a result of public traffic) while another partial area is traversed by only one person (for service).
One simple and effective measure in image evaluation is, for example, gray level analysis. In the selected segment, for example, an average of the existing gray levels is computed. Histograms of the gray levels (or the color values) can also be determined. If the gray level average deviates in a statistically relevant manner from the corresponding averages with other time references, the video image is switched, for example, to a surveillance monitor (or an alarm is triggered).
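The gray level features named above can be sketched as follows. The image is assumed to be a list of rows of 8-bit gray values, and segments are assumed to be rectangles; the function names and the choice of four histogram bins are illustrative only.

```python
def segment_pixels(image, top, left, height, width):
    """Collect the gray values of one rectangular segment (image region)."""
    return [v for row in image[top:top + height] for v in row[left:left + width]]


def gray_mean(pixels):
    """Average gray level of a segment."""
    return sum(pixels) / len(pixels)


def gray_histogram(pixels, bins=4, levels=256):
    """Histogram of gray levels with `bins` equally wide bins over 0..levels-1."""
    hist = [0] * bins
    for v in pixels:
        hist[v * bins // levels] += 1
    return hist
```

The resulting average (or histogram) would be stored per segment and time stamp, so that it can later be compared with the averages of other time references.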
Furthermore, data about existing textures, lines and edges can be determined. They can provide information about the position of an article or its orientation. In particular, edges are suitable for determining the direction and speed of displacement. The speed and direction can also be computed by comparing the instantaneous image with one or more previous ones. The comparison image can be the immediately preceding one (which at an image frequency of for example 25 Hz lags by 1/25 s) or one lagging several cycles. How large the time interval should be depends on the expected speed of the moving object.
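The arithmetic behind the image comparison can be made concrete with a short sketch. It assumes that the position of an object (for example of one of its edges) has already been located in both images; how that is done is not shown, and the function name and pixel units are assumptions.

```python
import math


def displacement(p_prev, p_now, lag_frames, frame_rate_hz=25.0):
    """Speed (pixels per second) and direction (degrees) between two positions.

    `p_prev` and `p_now` are (x, y) positions in the earlier and the
    instantaneous image; `lag_frames` is the number of cycles between them.
    The direction is measured from the positive x axis; in image coordinates
    the y axis usually points downward.
    """
    dt = lag_frames / frame_rate_hz  # e.g. 1 frame at 25 Hz -> 1/25 s
    dx = p_now[0] - p_prev[0]
    dy = p_now[1] - p_prev[1]
    speed = math.hypot(dx, dy) / dt
    direction = math.degrees(math.atan2(dy, dx))
    return speed, direction
```

For slowly moving objects the displacement over a single cycle may be below one pixel, which is why a lag of several cycles can be the better choice.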
It can be advantageous to identify the moving objects in the image (for example, as a “person”, “vehicle”, “unknown object”). Allowable parameters can be determined for each object (location, speed, direction). In this way for example extraordinary movements can be distinguished from ordinary ones. (A vehicle travelling on the road and an individual moving on the sidewalk are ordinary events, while an individual moving in a certain direction on the road can be an extraordinary event).
Reliability can be greatly improved, and the false alarm rate reduced, by a suitable, i.e. situation-referenced, choice of comparison instants. It may be enough in certain surveillance situations if the statistical comparison is simply referenced to an immediately preceding interval (for example, the last thirty minutes). In more complex situations it can conversely be important to establish the time references more selectively. Statistical comparison can be limited, for example, to similar time domains (similar times of day, similar days of the week). It is furthermore possible to define the time references by additional parameters. For example, surveillance situations are conceivable in which the temperature plays a part. In that case only those data which have a similar parameter value (for example a similar temperature) are considered in the statistical comparison. Furthermore, conditions can also be considered. For example, it is possible for an event B to be critical only when it follows event A.
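The selection of comparison data by time reference and by an additional parameter can be sketched as a simple filter. The record layout (time stamp, temperature, value), the window sizes and the function name are assumptions chosen for illustration; the conditional case (event B following event A) is not covered here.

```python
from datetime import datetime


def select_references(now, now_temp, history,
                      hour_window=1.0, same_weekday=True, temp_window=None):
    """Pick the stored values that qualify for comparison with the present.

    `history` is a list of (timestamp, temperature, value) tuples.
    A record qualifies if its time of day lies within `hour_window` hours of
    `now`, if it falls on the same day of the week (when `same_weekday` is
    set), and if its temperature lies within `temp_window` of `now_temp`
    (when `temp_window` is given).
    """
    selected = []
    now_minutes = now.hour * 60 + now.minute
    for ts, temp, value in history:
        if same_weekday and ts.weekday() != now.weekday():
            continue
        diff = abs(ts.hour * 60 + ts.minute - now_minutes)
        diff = min(diff, 24 * 60 - diff)  # time of day wraps around midnight
        if diff > hour_window * 60:
            continue
        if temp_window is not None and abs(temp - now_temp) > temp_window:
            continue
        selected.append(value)
    return selected
```

Only the values returned by such a filter would then enter the statistical comparison, so that, for instance, a Monday morning is compared with earlier Monday mornings rather than with the preceding night.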
According to one preferred embodiment the video images are filed in a FIFO memory. If an extraordinary state of the surveyed object is ascertained, an alarm is triggered. As a result, for example, the guard takes control and can play back the video sequences contained in the FIFO memory. The guard assesses the situation and assigns it to a certain category (“dangerous”, “not dangerous”). This result is stored in the system together with the parameter values which led to the alarm in this case. In later situations it is possible to incorporate the assessment of the guard into the situation analysis. The false alarm rate can be successively optimized in this way.
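A minimal sketch of the FIFO memory and of the stored guard assessment might look as follows. The class and function names, the capacity and the dictionary layout are hypothetical; the text does not prescribe any particular data structure.

```python
from collections import deque


class FrameFifo:
    """Fixed-capacity FIFO memory: the oldest image is dropped when full."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def push(self, timestamp, frame):
        self._frames.append((timestamp, frame))

    def replay(self):
        """Return the buffered sequence, oldest first, for playback by the guard."""
        return list(self._frames)


def record_assessment(log, parameters, category):
    """Store the guard's category together with the parameter values that led to the alarm."""
    log.append({"parameters": dict(parameters), "category": category})
```

A later situation analysis could consult such a log, for example to suppress alarms whose parameter values closely match cases already assessed as "not dangerous"; how that matching is done is left open here.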
In a start-up phase it is also conceivable to selectively train the system. For this purpose certain test situations are played through in the surveillance region (for example, a break-in). The guard marks those video images or instants which must lead to an alarm signal. The system then stores the data or parameters which belong to the corresponding image and determines their statistical deviation from those of a normal situation.
The invention is not limited to analysis of video signals. In particular the evaluation of acoustic signals can also be of interest. Preferably spectral analysis is done. The signal is divided, for example, into segments with a length of 1 to 10 seconds. Each of these signal segments is broken down, for example, into blocks with a length in the range from 20 to 50 ms, which are transformed into the spectral domain with a Fourier transform (FFT).
To extract the characteristic features, frequency ranges can be stipulated, for example, in which the energy distribution is determined. In this way, for example, travel noises can be identified. By using specific criteria voice noises can be identified. By comparing succeeding signal segments, other information can be obtained (for example the regular impact of wheels at track joints). If the system according to the invention then ascertains statistically relevant deviations (for example, a sudden rise of travel noises, unusual voice noises, etc.), this can be used as an indicator of an extraordinary situation (for example, open doors when a train is moving).
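The block-wise spectral analysis and the energy per frequency range can be sketched as below. The sample rate of 8000 Hz, the block length of 256 samples (32 ms at that rate), the band limits and all function names are assumptions for illustration; a small radix-2 FFT is written out so that the sketch needs no external library.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def split_blocks(signal, block_len):
    """Break a signal segment into consecutive non-overlapping blocks."""
    return [signal[i:i + block_len]
            for i in range(0, len(signal) - block_len + 1, block_len)]


def band_energies(block, sample_rate, bands):
    """Sum |X(f)|^2 over each stipulated frequency range [lo, hi) in Hz."""
    n = len(block)
    spectrum = fft(block)
    energies = []
    for lo, hi in bands:
        energies.append(sum(abs(spectrum[k]) ** 2
                            for k in range(n // 2 + 1)
                            if lo <= k * sample_rate / n < hi))
    return energies
```

The list of band energies per block (or its average over a segment) would play the same role for the audio signal as the gray level average does for an image segment: it is stored as characteristic data and compared statistically with earlier data sets.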
In principle the process according to the invention is suitable for any surveillance situation. Its special strength, however, appears in complex situations. These can be found, for example, wherever an area accessible to the public (entirely or partially) is to be surveyed. One example is the surveillance of cash machines. With a system according to the invention the passenger compartment of a means of transportation (for example, of a train) can also be continuously surveyed.
Surveillance of production facilities and individual process steps should also be mentioned. Larger areas (for example a nuclear power plant) can be surveyed with several cameras. The evaluation in the invention can acquire data of several cameras as a totality (i.e. as a comprehensive data set) so that logic links between the images of different cameras are possible.
Quite generally it is advantageous to combine several detection devices of different types. Assessment of a surveillance situation using audio and video, for example, is more reliable than if only audio or only video is present. Chemical detectors or analysis devices can also deliver important information. The choice and composition of the different devices or sensor types of course depend on the specific situation.
The following detailed description and totality of patent claims yield other advantageous embodiments and combinations of features of the invention.