The present invention relates to a method for reducing the resolution of an input image, wherein the input image shows a three-dimensional scene recorded by a surveillance camera, and wherein the distance between the surveillance camera and the objects in the three-dimensional scene is referred to as the object distance. The present invention also relates to a related device and to a computer program for carrying out the method. Methods for reducing resolution without accounting for the object distance are known from the literature.
Video surveillance systems are used to monitor public spaces, such as train stations, market places, street intersections, or the like, and public buildings, such as libraries, agencies, courtrooms, prisons, etc. Video surveillance systems are also used in the private sector, e.g. as alarm systems or for watching over individuals who require attention.
Video surveillance systems usually include a plurality of permanently installed cameras, which observe the relevant areas in the surroundings, and a means for evaluating the video sequences recorded by the cameras. While the evaluation was previously carried out by monitoring personnel, automatic evaluation of the video sequences has become more common.
In a typical application of automatic monitoring, moving objects are first separated from the essentially stationary background of the scene (object segmentation) and are then tracked over time (object tracking); if relevant movements or patterns of movement occur, an alarm is triggered. One possible design of automatic monitoring of this type is described, e.g., in EP 0710927 B1, which discusses a method for the object-oriented detection of moving objects.
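The segmentation step can be illustrated with a minimal sketch. It assumes simple background differencing on grayscale images stored as nested lists; it is not the method of EP 0710927 B1. The function names `segment_moving_pixels` and `bounding_box` and the threshold value are hypothetical and chosen only for illustration.

```python
def segment_moving_pixels(frame, background, threshold=25):
    """Mark a pixel as moving if it differs from the background by more than threshold.

    frame and background are equally sized lists of rows of gray values.
    The threshold of 25 is an arbitrary illustrative assumption.
    """
    return [
        [abs(pixel - reference) > threshold for pixel, reference in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of all moving pixels, or None if there are none."""
    coords = [(x, y) for y, row in enumerate(mask) for x, moving in enumerate(row) if moving]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Stationary background and a frame in which a small bright object has appeared.
background = [[10] * 4 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][1] = frame[1][2] = frame[2][1] = 200

mask = segment_moving_pixels(frame, background)
box = bounding_box(mask)
```

Tracking would then associate such bounding boxes across successive frames. The sketch treats all moving pixels as one object, which a practical system would not do.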
It also appears to be commonly known to approximately determine object properties such as size or speed based on estimated object positions in three-dimensional space, or based on variables related thereto, using an assumed base plane in the scene (see, e.g., the operating instructions “IVMD 1.0 Intelligent Video Motion Detection, Configuration Instructions” for Bosch IVMD Software). It is also known to automatically control the zoom function/focal length of a PTZ camera (pan-tilt-zoom camera) based on an estimated object distance or the angle of inclination of the camera, with the goal of depicting a tracked object at the same size in every image of the video sequence, regardless of its changing position (Bosch AutoDome cameras with AutoTrack function, see EP 1427212 A1).
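The relationship between object distance, focal length, and depicted object size can be sketched with the ideal pinhole camera model, in which the image height of an object is proportional to the focal length and inversely proportional to the object distance. This is an assumed simplification for illustration and not the implementation of the cited products; the function names and the numeric values are hypothetical.

```python
def projected_height_px(focal_length_px, object_height_m, distance_m):
    """Image height of an object under the pinhole model: h = f * H / Z."""
    return focal_length_px * object_height_m / distance_m


def focal_length_for_constant_size(target_height_px, object_height_m, distance_m):
    """Focal length that keeps the object at target_height_px: f = h * Z / H."""
    return target_height_px * distance_m / object_height_m


# Assumed example: a 1.8 m tall person observed at two different distances.
near = projected_height_px(1000.0, 1.8, 10.0)
far = projected_height_px(1000.0, 1.8, 20.0)

# Doubling the distance halves the image height, so a zoom control that
# keeps the depicted size constant has to double the focal length.
f_near = focal_length_for_constant_size(180.0, 1.8, 10.0)
f_far = focal_length_for_constant_size(180.0, 1.8, 20.0)
```

The same model explains why objects far from the camera occupy fewer pixels than near objects of the same physical size.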
Despite the continuous increase in processor performance, it is still a challenge to process video sequences in real time, that is, to perform object segmentation and object tracking in real time. To reduce the processor load, it is common to reduce the resolution of video sequences equally in the horizontal and/or vertical directions via down-sampling before the images are processed. In this manner, the amount of computing effort required for every individual image in the video sequence, and therefore for the entire video sequence, is minimized or at least reduced.
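Such uniform down-sampling can be sketched as follows. The example assumes a grayscale image stored as a list of rows and uses block averaging, which is only one of several possible down-sampling filters; the function name `downsample` and its parameters are hypothetical.

```python
def downsample(image, factor_x, factor_y):
    """Reduce resolution by averaging blocks of factor_y x factor_x pixels.

    The same factors are applied everywhere in the image, i.e. the
    reduction does not depend on the object distance. Rows and columns
    that do not fill a complete block are discarded.
    """
    out_height = len(image) // factor_y
    out_width = len(image[0]) // factor_x
    result = []
    for block_y in range(out_height):
        row = []
        for block_x in range(out_width):
            block = [
                image[block_y * factor_y + dy][block_x * factor_x + dx]
                for dy in range(factor_y)
                for dx in range(factor_x)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]

small = downsample(image, 2, 2)  # [[2.0, 6.0], [10.0, 20.0]]
```

With a factor of 2 in both directions, each subsequent processing step handles one quarter of the original pixels, which is the source of the reduced computing effort.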
Moreover, it is known from the field of video coding, e.g. as described in the conference presentation “Ingo Bauermann, Matthias Mielke and Eckehard Steinbach: H.264 based coding of omnidirectional video; International Conference on Computer Vision and Graphics ICCVG 2004, September 2004, Warsaw”, that rectifying the perspective distortions that occur, e.g. in the case of 360° panoramic cameras and fisheye lenses, may have an advantageous effect on the subsequent image-processing procedure.
Methods for the targeted rectification or distortion of images or image sequences have furthermore been known for a long time, e.g. from the field of analog and digital photography (rectification of aerial photographs, compensation of converging verticals) and computer graphics (warping and morphing methods).
The prior art also discloses methods for camera calibration, with which important camera characteristics, such as lens properties and the position and/or orientation relative to objects, may be derived automatically from suitable image sequences.