It is becoming increasingly important to remain vigilant in order to protect human lives and assets, especially in public spaces such as bus stops, railway stations, airports, hospitals, and schools. Surveillance techniques employed in the past have mostly consisted of capturing video-based evidence. This requires setting up the infrastructure needed to capture video and employing processing techniques to extract vital information.
Audio-based methods are an alternative technique that can be used to monitor environments reliably and thereby improve safety and security. In addition, audio-based methods can provide invaluable support to existing surveillance efforts. Another benefit is the relatively lower cost of setting up an audio-based infrastructure.
Currently available audio-based surveillance systems work on pattern matching, where the temporal patterns of different sounds that occur repeatedly in an environment are learnt. There is typically a reference databank of temporal patterns of sounds corresponding to known events. One technique describes a system and method to record recurring sounds in an ambient environment, and these sounds are compared with sounds pre-captured in a reference database. Examples are sounds that correspond to a regular daily routine, such as the opening and closing of doors, the sound of water boiling, the movement of people at a particular time, and so on. Counts of sound occurrences are maintained, and any deviation beyond a pre-defined threshold is marked as abnormal, for instance if a kettle boils at 7 am instead of 6:30 am, or if the door-opening sound occurs less often than a pre-defined frequency. Visual information, such as the output from a video camera covering the environment, is used to correlate any detected abnormality.
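The routine-deviation check described above can be illustrated with a minimal sketch. It is not taken from any particular prior-art system: the `RoutineEntry` schema, the `find_abnormalities` function, the labels, and all tolerance values are hypothetical and chosen only to mirror the kettle and door examples.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class RoutineEntry:
    """Reference pattern for one recurring sound (hypothetical schema)."""
    label: str
    expected_time: time   # usual time of day the sound is heard
    tolerance_min: int    # allowed drift from expected_time, in minutes
    min_daily_count: int  # minimum number of occurrences expected per day


def minutes(t: time) -> int:
    """Convert a time of day to minutes since midnight."""
    return t.hour * 60 + t.minute


def find_abnormalities(reference, observations):
    """Compare one day of (label, time) observations with the reference.

    Returns a list of (label, reason) tuples for every deviation found.
    """
    flagged = []
    for entry in reference:
        times = [t for label, t in observations if label == entry.label]
        # Count check: the sound occurred less often than expected.
        if len(times) < entry.min_daily_count:
            flagged.append((entry.label, "too few occurrences"))
        # Timing check: even the closest occurrence is outside the tolerance.
        if times:
            drift = min(abs(minutes(t) - minutes(entry.expected_time)) for t in times)
            if drift > entry.tolerance_min:
                flagged.append((entry.label, f"off schedule by {drift} min"))
    return flagged


# Assumed reference routine and one day of observed sounds.
reference = [
    RoutineEntry("kettle_boil", time(6, 30), tolerance_min=15, min_daily_count=1),
    RoutineEntry("door_open", time(8, 0), tolerance_min=60, min_daily_count=4),
]
observations = [
    ("kettle_boil", time(7, 0)),
    ("door_open", time(8, 5)),
    ("door_open", time(8, 40)),
]
flagged = find_abnormalities(reference, observations)
print(flagged)
# [('kettle_boil', 'off schedule by 30 min'), ('door_open', 'too few occurrences')]
```

A fixed tolerance per sound keeps the sketch short; it also shows why such systems depend heavily on how the reference routine is defined.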
Yet another known method is to survey sounds emanating from a target environment and capture them to create a preparatory database, which is then studied by the operators who will use the system. The method further describes how an operator marks a sound heard in a real scenario and also signals the type of scenario. For instance, the sound of glass breaking or a gunshot in the audio is marked as a deviation. A spectral analysis is performed for the location where the sound is heard, and the result is compared with similar sounds recorded in the preparatory database to identify events. The process comprises identifying similar sounds and performing a match operation against a reference database before marking the sound as a definite abnormality.
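The spectral comparison against a preparatory database can be sketched as follows. The source does not say which spectral representation or similarity measure is used, so the magnitude spectrum, the cosine similarity, the 0.8 threshold, and the names `magnitude_spectrum`, `best_match`, and `tone` are all assumptions made for illustration. Clips are assumed to have equal length, and synthetic tones stand in for real recordings.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for the first N/2 bins (adequate for short clips)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length spectra (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(clip, database, threshold=0.8):
    """Return (label, score) of the most similar reference sound.

    The label is None when no reference reaches the assumed threshold.
    """
    clip_spec = magnitude_spectrum(clip)
    best_label, best_score = None, 0.0
    for label, ref in database.items():
        score = cosine_similarity(clip_spec, magnitude_spectrum(ref))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


def tone(freq_bin, n=64, amp=1.0):
    """Synthetic sine wave with its energy in one DFT bin (stand-in for a recording)."""
    return [amp * math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


# Hypothetical preparatory database of operator-labelled sounds.
database = {"glass_break": tone(20), "gunshot": tone(5)}

label, score = best_match(tone(20, amp=0.5), database)
print(label)  # glass_break
```

Cosine similarity ignores overall loudness, which is why the quieter clip still matches its reference; a sound with no similar entry in the database returns `None` instead of a forced match.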
Often, these systems are unable to raise an alarm in real time, especially if the event requires the analysis of a temporally lengthy sound signal. Further, there are false negatives in the alarms raised, simply because an activity or an event may not occur in the exact sequence of the set of activities matched from the reference database. Existing audio-based systems have several other limitations. First, they need to be extensively trained to detect what is an uncharacteristic sound and what is not. Second, a sound that is normal at one time of the day may not necessarily be normal at another time of the day, or when occurring together with another event. Also, current systems do not consider the use of mobile sources of sound input.
There is a need for a system that automatically identifies an event based on sound superimposed with inputs about the context. Here, the context may mean the time, position, or setting of the observation, or pre-existing knowledge such as train schedules in a train station or class schedules in a school. The identification of an event should also be determined through self-learning processes configured in the system.
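To make the notion of context concrete, the following is a minimal sketch of how a detected sound might be judged against the time, the setting, and a known schedule. It illustrates the stated need only and is not the claimed system: the `Context` structure, the `EXPECTED_NEAR_SCHEDULE` table, the `is_expected` function, and the five-minute window are all hypothetical, and the self-learning aspect is not covered.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Context:
    """Context of one observation (hypothetical structure)."""
    setting: str                                   # e.g. "train_station", "school"
    timestamp: datetime                            # when the sound was heard
    schedule: list = field(default_factory=list)   # known event times (datetimes)


# Assumed table of sounds considered normal around scheduled events, per setting.
EXPECTED_NEAR_SCHEDULE = {
    "train_station": {"train_rumble", "crowd_noise"},
    "school": {"bell", "crowd_noise"},
}


def is_expected(sound_label, ctx, window=timedelta(minutes=5)):
    """Return True if the sound is normal for this setting and close to a scheduled event."""
    allowed = EXPECTED_NEAR_SCHEDULE.get(ctx.setting, set())
    if sound_label not in allowed:
        return False
    return any(abs(ctx.timestamp - event) <= window for event in ctx.schedule)


# Assumed train schedule for one station.
schedule = [datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)]

daytime = Context("train_station", datetime(2024, 1, 1, 9, 3), schedule)
night = Context("train_station", datetime(2024, 1, 1, 2, 0), schedule)

print(is_expected("train_rumble", daytime))  # True
print(is_expected("train_rumble", night))    # False
```

The same sound label yields different outcomes depending on the context, which is the behavior that purely sound-based pattern matching cannot provide.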