1. Field of the Invention
The invention relates to an image processing method and an image processing apparatus using the same.
2. Description of Related Art
Video surveillance systems have a wide range of scientific and technological applications, such as computer vision, transport networks, elderly care, traffic monitoring, general analysis, and endangered species conservation. The architecture of video surveillance system applications encompasses several tasks, such as motion detection, identification, object classification, tracking, behavior recognition, and activity analysis. Motion detection is the first essential process in video surveillance systems and plays an important role, since it is the stage at which moving objects are extracted from video streams. Many motion detection approaches have been proposed to achieve complete and accurate detection in sequences of normal visual quality. Conventional motion detection approaches fall into several major categories, including optical flow, temporal differencing, and background subtraction. Optical flow can achieve detection very well by projecting motion onto the image plane with proper approximation. However, this method is very sensitive to noise and can be computationally inefficient; for example, processing outdoor scenes may not be computationally affordable for real-time applications. Temporal differencing detects moving objects by calculating the difference between consecutive frames, and can effectively accommodate environmental changes. However, it tends to extract incomplete shapes of moving objects, particularly when those objects are motionless or exhibit limited mobility in the scene. To solve this problem, the very popular motion detection method of background subtraction is often used. Background subtraction detects moving objects in a video sequence by evaluating the pixel feature differences between the current image and a reference background image. This method not only provides very high-quality motion information, but also demands less computational complexity than other motion detection methods.
All in all, the background subtraction technique is the most effective method by which to solve motion detection problems.
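The contrast between temporal differencing and background subtraction can be sketched as follows. This is a minimal illustration rather than the method of any particular prior-art system: the threshold value, the tiny grayscale frames, and the function names are all assumptions made for the example.

```python
# Minimal sketch: frames are small grayscale images stored as lists of rows.
# THRESHOLD and the frame contents are illustrative assumptions.

THRESHOLD = 30  # hypothetical intensity-difference threshold


def difference_mask(frame_a, frame_b, threshold=THRESHOLD):
    """Return a binary mask: 1 where |frame_a - frame_b| exceeds threshold."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def temporal_difference(previous_frame, current_frame):
    """Temporal differencing: compare two consecutive frames."""
    return difference_mask(current_frame, previous_frame)


def background_subtraction(background, current_frame):
    """Background subtraction: compare the current frame with a reference background."""
    return difference_mask(current_frame, background)


# Reference background: a uniform dark scene.
background = [[10, 10, 10, 10] for _ in range(3)]

# An object (intensity 200) has entered at column 1 and then stays still.
frame_1 = [[10, 200, 10, 10] for _ in range(3)]
frame_2 = [[10, 200, 10, 10] for _ in range(3)]

temporal_mask = temporal_difference(frame_1, frame_2)
background_mask = background_subtraction(background, frame_2)

print("temporal difference:   ", temporal_mask[0])    # [0, 0, 0, 0]
print("background subtraction:", background_mask[0])  # [0, 1, 0, 0]
```

In this toy case the motionless object disappears from the temporal-difference mask, while the comparison against the reference background still marks it.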
The need for precise motion detection has increased dramatically since the 9/11 attacks, which has subsequently led to higher demand for a more reliable and accurate background model generated through background subtraction. As a consequence, many background subtraction-based methods have been proposed to segment moving objects in video sequences.
In recent years, research conducted in the area of video surveillance systems has been oriented towards the low-quality video streams prevalent in many real-world limited-bandwidth networks. Handheld media and mobile devices have gained popularity, as have real-time video applications on wireless networks, such as video conferencing and security monitoring. However, video communications over wireless networks can easily suffer from network congestion or unstable bandwidth. The quality of network services is seriously degraded whenever the traffic exceeds the available network bandwidth. Rate control is an important video coding tool that reduces video quality to produce lower bit-rate video streams that match the available wireless network bandwidth, thereby minimizing network congestion. In general, most background subtraction methods suffice for situations involving normal video quality.
However, complete and accurate motion detection in variable bit-rate video streams is a very difficult challenge. The main reason for this is that the background models generated by previous background subtraction methods may not be applicable to real-world networks of differing bandwidth that carry variable bit-rate compressed video.
Specifically, the previously proposed background subtraction methods did not consider that the quality of the images to be processed may change adaptively. If these methods are applied to images with variable bit rates, the generated background models may easily lead to detection errors or false alarms.
Taking FIG. 1A as an example, FIG. 1A is a schematic diagram illustrating a situation in which the background models are generated according to high-quality images. It is assumed that the 0th frame to the 300th frame, which correspond to high-quality images, are adopted to generate the background models, and that the frames after the 300th frame are low-quality frames. As can be observed in FIG. 1A, the background model generated based on high-quality images is inherently full of fluctuations. Therefore, when a moving object appears around the 310th to the 350th frame, which correspond to low-quality images, the moving object would be regarded as merely fluctuations similar to those existing in the background models. That is, the moving object cannot be correctly recognized, and hence a detection error occurs.
Taking FIG. 1B as another example, FIG. 1B is a schematic diagram illustrating a situation in which the background models are generated according to low-quality images. It is assumed that the 0th frame to the 600th frame, which correspond to low-quality images, are adopted to generate the background models, and that the frames after the 600th frame are high-quality frames. As can be observed in FIG. 1B, the background model generated based on low-quality images is inherently smooth. Therefore, the fluctuations occurring in the frames after the 600th frame would be regarded as corresponding to moving objects, even though these fluctuations are merely disturbances inherently included in high-quality images, i.e., background signals. That is, the background signals might be mistakenly regarded as corresponding to moving objects, which leads to a false alarm.
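The two failure modes described with reference to FIG. 1A and FIG. 1B can be reproduced with a toy single-pixel simulation. The sketch below is illustrative only: the mean-and-spread background model, the margin, and all intensity values are assumptions chosen for the example and are not taken from the figures or from any specific prior-art method.

```python
# Illustrative sketch only: a single pixel's intensity over time.
# MARGIN, the signal values, and the mean/spread model are assumptions
# chosen for illustration and do not come from the source.

MARGIN = 5  # hypothetical tolerance added to the learned spread


def build_background_model(training_values):
    """Learn the mean intensity and the largest deviation seen in training."""
    mean = sum(training_values) / len(training_values)
    spread = max(abs(value - mean) for value in training_values)
    return mean, spread


def is_moving(value, model, margin=MARGIN):
    """Flag a pixel as moving when it deviates beyond the learned spread."""
    mean, spread = model
    return abs(value - mean) > spread + margin


# High-quality background: fluctuates around 100 (alternating 80 and 120).
high_quality_background = [80, 120] * 10
# Low-quality background: compression has smoothed the signal to a constant 100.
low_quality_background = [100] * 20

# FIG. 1A situation: model built from high-quality frames, then a moving
# object (intensity 115) appears in low-quality frames.
model_from_high = build_background_model(high_quality_background)
object_detected = is_moving(115, model_from_high)

# FIG. 1B situation: model built from low-quality frames, then ordinary
# high-quality background fluctuations (80 and 120) arrive.
model_from_low = build_background_model(low_quality_background)
false_alarms = [is_moving(v, model_from_low) for v in high_quality_background]

print("object detected (FIG. 1A case):", object_detected)                    # False
print("background flagged as moving (FIG. 1B case):", any(false_alarms))     # True
```

Under these assumptions, the model learned from fluctuating high-quality frames tolerates deviations large enough to hide the object, while the model learned from smooth low-quality frames flags ordinary background fluctuations as motion.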