In recent years, video image detection systems have been proposed in various applications for identifying and tracking moving objects. In particular, wireless video surveillance, which uses automatic detection to track a moving object, has been a key technology in the management of intelligent surveillance systems. In the field of traffic management, for example, video image detection techniques have been deployed in intelligent transportation systems (ITS) for purposes such as alleviating traffic congestion, improving transportation safety, and optimizing traffic flow. By accurately distinguishing vehicles from background objects, an intelligent transportation system may obtain current traffic volumes along a road or even detect and track a particular vehicle.
Conventional moving object detection methods may be classified into three main approaches: Temporal Differencing, Optical Flow, and Background Subtraction.
In a Temporal Differencing technique, regions of motion may be detected based on pixel-wise differences between successive frames in a video stream. Such a technique can adapt to dynamic scene changes, yet it tends to extract the shapes of moving objects incompletely, particularly when the objects become motionless.
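As a minimal sketch of the frame-differencing idea described above, the following Python fragment thresholds the pixel-wise difference between two successive frames. The function name `temporal_difference`, the threshold of 25, and the one-row test frames are assumptions made purely for illustration and do not come from any particular prior-art system.

```python
def temporal_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary motion mask: 1 where |curr - prev| exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]

# Hypothetical 1x6 grayscale strips: a bright 3-pixel object on a dark road.
frame_a = [[10, 200, 200, 200, 10, 10]]   # object occupies columns 1..3
frame_b = [[10, 10, 200, 200, 200, 10]]   # object has moved right by one pixel
frame_c = [[10, 10, 200, 200, 200, 10]]   # object has stopped

moving_mask = temporal_difference(frame_a, frame_b)
stopped_mask = temporal_difference(frame_b, frame_c)

print(moving_mask)   # only the trailing and leading edges of the object are marked
print(stopped_mask)  # nothing is marked once the object stops
```

In this toy case only the two edge pixels of the object are flagged while it moves, and no pixel is flagged once it stops, which corresponds to the incomplete shape extraction noted above.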
An Optical Flow technique may estimate flow vectors of moving objects from the partial derivatives of brightness values with respect to temporal and spatial coordinates between successive frames in a video stream. However, such a technique can be sensitive to noise and, because of its computational burden, inefficient for traffic applications.
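The derivative-based estimate described above may be illustrated with a one-dimensional, least-squares sketch in the spirit of gradient-based optical flow. It assumes brightness constancy over a small window; the function name `estimate_flow_1d`, the window, and the sample rows are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def estimate_flow_1d(prev_row, curr_row, window):
    """Least-squares displacement (pixels/frame) over the given pixel indices.

    Solves ix * u + it = 0 in the least-squares sense, where ix is the spatial
    derivative and it is the temporal derivative of brightness.
    """
    num = 0.0
    den = 0.0
    for x in window:
        ix = (prev_row[x + 1] - prev_row[x - 1]) / 2.0  # central spatial difference
        it = curr_row[x] - prev_row[x]                  # frame-to-frame difference
        num += ix * it
        den += ix * ix
    if den == 0:
        return None  # no brightness gradient in the window, flow is undefined
    return -num / den

# Hypothetical brightness ramp that shifts right by one pixel between frames.
prev_row = [0, 10, 20, 30, 40, 50, 60]
curr_row = [0, 0, 10, 20, 30, 40, 50]
clean_flow = estimate_flow_1d(prev_row, curr_row, range(2, 5))

# Same shift, but one sample is disturbed by 5 gray levels of noise.
noisy_row = [0, 0, 10, 25, 30, 40, 50]
noisy_flow = estimate_flow_1d(prev_row, noisy_row, range(2, 5))

print(clean_flow, noisy_flow)
```

With the clean rows the estimate recovers the one-pixel shift, whereas a single disturbed sample already pulls the estimate away from it, which is consistent with the noise sensitivity mentioned above.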
A Background Subtraction technique has been commonly used in video surveillance and target recognition. With background subtraction, moving foreground objects can be segmented from stationary or dynamic background scenes by comparing pixel differences between a current image and a reference background model of the previous image. The background subtraction technique has generally been the most satisfactory method for motion detection.
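The comparison against a reference background model may be sketched as follows. The text above does not specify how the reference model is maintained, so the exponential running average used here, along with the names `foreground_mask` and `update_background`, the learning rate `alpha`, and the threshold, are assumptions for illustration only.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose difference from the background model exceeds threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

def update_background(background, frame, alpha=0.05):
    """Blend the current frame into the model (simple running average, all pixels)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

# Hypothetical 1x4 background model and a frame in which a bright vehicle
# covers the two middle pixels.
background = [[50.0, 50.0, 50.0, 50.0]]
frame = [[52, 180, 175, 49]]

mask = foreground_mask(background, frame)
background = update_background(background, frame)

print(mask)
```

Because every pixel is compared with the model rather than with the preceding frame, the whole vehicle region is marked in this example, including pixels that would not change from one frame to the next.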
Many variations of the background subtraction method have been proposed to detect moving vehicles within video sequences in a network environment with ideal bandwidth. A Σ-Δ filter technique has been used in the Sigma Difference Estimation (SDE) approach to estimate two orders of temporal statistics for each pixel in a sequence in accordance with a pixel-based decision framework. Unfortunately, the SDE approach may be insufficient for complete object detection in certain complex environments. In an attempt to remedy this problem, the Multiple SDE (MSDE) approach, which combines multiple Σ-Δ estimators to calculate a hybrid background model, has been developed. Besides the Σ-Δ filter technique, the Gaussian Mixture Model (GMM) has been widely used for robustly modeling backgrounds. Using the GMM, each pixel value is modeled independently in one particular distribution, and a subsequent distribution of each pixel would be determined based on whether or not it belongs to the background. On the other hand, a simple background model is derived by the Simple Statistical Difference (SSD) method, which uses the temporal average as the main criterion to accomplish the detection of moving vehicles. The Multiple Temporal Difference (MTD) method retains several previous reference frames against which frame differences are calculated. This, in turn, shrinks gaps within the moving objects.
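To make the "two orders of temporal statistics" concrete, the following sketch applies a Σ-Δ style update to a single pixel. It follows the commonly published form of the Σ-Δ estimator (a background estimate stepped by ±1 toward the input, and a dispersion estimate stepped by ±1 toward a multiple of the absolute difference); it is not necessarily the exact SDE variant referred to above, and the names `sigma_delta`, `n`, and `v0` as well as the sample values are assumptions.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def sigma_delta(intensities, n=2, v0=1):
    """Label each sample of one pixel as background (0) or moving (1)."""
    m = intensities[0]  # first-order statistic: background estimate
    v = v0              # second-order statistic: dispersion estimate
    labels = []
    for i in intensities:
        m += sign(i - m)              # step the background estimate toward the input
        delta = abs(i - m)            # absolute difference from the background
        if delta != 0:
            v += sign(n * delta - v)  # step the dispersion toward n * delta
        labels.append(1 if delta > v else 0)
    return labels, m, v

# Hypothetical luminance trace: steady background, a 3-frame object, background again.
trace = [100, 100, 100, 100, 100, 180, 180, 180, 100, 100, 100]
labels, m, v = sigma_delta(trace)
print(labels, m, v)
```

Since both statistics change by at most one gray level per frame, the estimator barely absorbs the short-lived object in this trace and returns to the original background value shortly after the object leaves.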
Unfortunately, video communication over real-world networks with limited bandwidth may frequently suffer from network congestion or bandwidth instability. This may be especially problematic when transmitting video information over wireless video communication systems. When congestion occurs in a communication network, most users can tolerate a streaming video of reduced quality better than a video that lags or stands still. Therefore, rate control schemes have been introduced as an effective video-coding tool for controlling the bit rate of video streams. To allocate the available network bandwidth, a rate control scheme may be used together with H.264/AVC video coding. Using this technique, variable bit-rate video streams are produced to allow superior transmission in wireless communication systems.
Nonetheless, although a rate control scheme may increase the efficiency of video stream transmission over networks with limited bandwidth, its tendency to continuously change the bit rate may make moving objects harder to detect. Hence, the aforementioned state-of-the-art background subtraction methods generally may not produce satisfactory detection results when applied to variable bit-rate video streams.
For example, FIGS. 1(a) and 1(b) show the same streaming video captured by a camera and transmitted over a wireless network. FIG. 1(a) is frame number 550, which has a bit-rate of 1,000 pixels per second, and FIG. 1(b) is frame number 918, which has a bit-rate of 2,000,000 pixels per second. FIG. 1(a) illustrates a pixel 101 of a tree along a road in frame 550, and FIG. 1(b) illustrates the same pixel 102 (i.e., at the same pixel location) of the identical tree along the road in the subsequent frame 918. FIG. 1(c) compares the data of this pixel by plotting the intensity variations of its luminance (Y) component as time progresses. In this scenario, when the network bandwidth is sufficient, the rate control scheme would typically raise a low bit-rate video stream to a high bit-rate video stream in order to match the available network bandwidth. The resulting background pixel value fluctuation 103 would often be misinterpreted as a moving object by a conventional background subtraction technique.
As another example, FIG. 2(a) shows frame number 55, which has a bit-rate of 2,000,000 pixels per second, and FIG. 2(b) shows frame number 209, which has a bit-rate of 1,000 pixels per second. FIG. 2(a) illustrates a pixel 201 of a tree along a road in frame 55, and FIG. 2(b) illustrates the same pixel 202 (i.e., at the same pixel location) in the subsequent frame 209, which shows a moving vehicle and the tree along the road. FIG. 2(c) compares the data of this pixel by plotting the intensity variations of its luminance (Y) component as time progresses. In this scenario, after the bit-rate is switched from a high-quality signal to a low-quality signal, the pixel value fluctuation would often disappear, and the pixel value indicating a moving object 203, such as a moving vehicle, would often be misinterpreted as a background object by a conventional background subtraction technique.
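The two failure cases above may be reproduced with a deliberately simple, hypothetical background model that learns a mean and a tolerance from a training stretch of one pixel. All luminance values below are synthetic and are not read from FIG. 1(c) or FIG. 2(c); the names `fit_model`, `detect`, and `k` are likewise assumptions for illustration.

```python
def fit_model(samples, k=2.0):
    """Learn (mean, tolerance) for one pixel from background-only samples."""
    mean = sum(samples) / len(samples)
    spread = max(abs(s - mean) for s in samples)
    return mean, max(k * spread, 1.0)

def detect(model, samples):
    """Return 1 for samples farther from the mean than the learned tolerance."""
    mean, tolerance = model
    return [1 if abs(s - mean) > tolerance else 0 for s in samples]

# Synthetic luminance (Y) values of a single tree pixel, for illustration only.
low_rate_background = [120, 120, 121, 120, 120, 121]   # nearly flat at low bit-rate
high_rate_background = [120, 131, 109, 128, 112, 133]  # strong fluctuation at high bit-rate
low_rate_vehicle = [138, 139, 138]                     # vehicle passing after a switch to low bit-rate

# First case: model learned at low bit-rate, stream then switches to high bit-rate.
false_alarms = detect(fit_model(low_rate_background), high_rate_background)

# Second case: model learned at high bit-rate, stream then switches to low bit-rate.
missed = detect(fit_model(high_rate_background), low_rate_vehicle)

print(false_alarms, missed)
```

In the first case nearly every background sample is flagged as moving, and in the second case the vehicle samples fall inside the wide tolerance learned from the fluctuating background and are not flagged at all.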
In response to the aforementioned problem of misidentification resulting from the fluctuating quality of video stream transmission, a new moving object detection method is proposed in order to enhance the accuracy of image detection for variable bit-rate video streams over real-world networks with limited bandwidth.