The present disclosure relates to a system and method for removing false foreground image content, such as headlights and shadows that may distort an object, in a foreground detection process performed on a video sequence. While the present disclosure is contemplated for use in projects concerning Video Analytics for Retail (“VAR”), including, inter alia, merge point management and measurement of drive-thru total experience time (“TET”), it is appreciated that the present exemplary embodiments are also amenable to other like applications.
The detection of foreground objects and/or moving objects (“foreground/moving objects”) is part of a video-based object tracking operation used in computer vision applications, such as surveillance and traffic monitoring. Example applications include, inter alia, vehicle speed estimation, automated parking monitoring, vehicle and pedestrian counting, traffic flow estimation, measurement of vehicle and pedestrian queue statistics, and TET measurement in retail spaces.
Two of the most common methods of motion detection used in applications that perform analytics on video data are frame-to-frame differencing and background estimation and subtraction (“background subtraction”). The frame differencing approach detects moving objects, but typically requires tuning to a narrow range of object speed relative to the frame rate and camera geometry. The background subtraction approach detects foreground objects rather than moving objects; however, moving objects also trigger foreground detection because their appearance differs from the background estimate. Background subtraction is more flexible in terms of adjusting the time scale and dynamically adjusting parameters in the background modeling. Although the term “subtraction” refers to a ‘minus’ operation in arithmetic, in computer vision and video processing applications it often refers to the removal of a component of an image. As such, the term can refer to operations including pixel-wise subtractions (‘minus’ operations) between images and pixel-wise fit tests between an image and a set of corresponding statistical models.
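The two motion detection strategies above can be illustrated with a minimal NumPy sketch. This is illustrative only; the thresholds, the per-pixel Gaussian background model, and the toy frames are assumptions for exposition, not part of the disclosure:

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Frame-to-frame differencing: flag pixels whose intensity changed
    by more than `threshold` between consecutive frames."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def background_subtraction_mask(frame, bg_mean, bg_std, k=3.0):
    """Statistical background subtraction: a pixel is foreground when it
    fails a pixel-wise fit test against a Gaussian background model,
    i.e. it lies more than k standard deviations from the mean."""
    deviation = np.abs(frame.astype(np.float64) - bg_mean)
    return deviation > k * np.maximum(bg_std, 1e-6)

# Toy 4x4 grayscale frames: a bright "object" enters the lower-right corner.
prev_f = np.full((4, 4), 50, dtype=np.uint8)
curr_f = prev_f.copy()
curr_f[2:, 2:] = 200

motion = frame_difference_mask(prev_f, curr_f)
fg = background_subtraction_mask(curr_f,
                                 bg_mean=np.full((4, 4), 50.0),
                                 bg_std=np.full((4, 4), 2.0))
print(motion.sum(), fg.sum())  # both detect the 4 changed pixels
```

Note that the fit-test variant keeps a statistical model (here a mean and standard deviation) per pixel, which is what allows the time scale and parameters to be adjusted dynamically, as stated above.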
Still, one challenge associated with the detection of foreground/moving objects is the removal of false foreground objects. Shadows or vehicle headlights (either the light itself or its reflection on the ground), both cast by and moving with an object across a scene, may cause problems in computer vision applications when the shadow and/or headlight pixels are misidentified as foreground pixels. Because the appearance of shadow and/or headlight pixels differs from the background, those pixels are detectable as foreground objects. Moreover, shadows and/or headlights have patterns of motion similar to the objects casting them, thus triggering false foreground object detections. Shadows can be especially problematic when they touch other moving objects in the scene. FIG. 1A shows an example scenario where the shadow 12 of a first vehicle 14 touches a second vehicle 16 in the scene, thereby making it difficult to identify the first and second vehicles 14, 16 as separate moving objects. Headlights can also be problematic where the ground reflection is not connected to the moving object casting the light. FIG. 2 shows an example scenario where the ground reflection is identified (by virtual box 18) as a first object and the vehicle 20 casting it is identified as a separate object.
Almost all existing approaches (deterministic or statistical, model- or non-model-based in terms of the decision process) exploit a large set of assumptions to limit complexity. Color-based methods, for example, rely on heuristic assumptions about the color characteristics of shadows. At the same time, these approaches are highly scene-dependent because they rely on carefully selected parameters, either in the shadow model or in the decision threshold, which limits their accuracy and extensibility to different types of scenarios. Furthermore, most existing shadow detection algorithms are error-prone and time-consuming because they operate on, and make pixel-wise decisions for, each pixel in the originally detected foreground mask.
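A heuristic color-based shadow test of the kind criticized above can be sketched as follows. The rule and all parameter values (`alpha`, `beta`, `tau_s`, `tau_h`) are assumptions chosen for illustration, and it is exactly this dependence on hand-picked, scene-specific parameters that limits such methods:

```python
import numpy as np

def shadow_mask_hsv(frame_hsv, bg_hsv, alpha=0.4, beta=0.95,
                    tau_s=40, tau_h=30):
    """Illustrative heuristic: a pixel is labeled shadow when its
    value (brightness) drops to between alpha and beta of the
    background value while hue and saturation stay close to the
    background. Every threshold here is a hand-tuned assumption."""
    h, s, v = (frame_hsv[..., i].astype(np.float64) for i in range(3))
    hb, sb, vb = (bg_hsv[..., i].astype(np.float64) for i in range(3))
    ratio = v / np.maximum(vb, 1e-6)
    return ((alpha <= ratio) & (ratio <= beta) &
            (np.abs(s - sb) <= tau_s) &
            (np.abs(h - hb) <= tau_h))

# Single-pixel HSV examples: a shadowed road pixel vs. a dark vehicle pixel.
bg = np.array([[[20, 100, 200]]], dtype=np.uint8)
shadowed = np.array([[[22, 90, 120]]], dtype=np.uint8)   # dimmed, same color
dark_obj = np.array([[[100, 200, 60]]], dtype=np.uint8)  # different color
print(shadow_mask_hsv(shadowed, bg)[0, 0],  # True  (removed as shadow)
      shadow_mask_hsv(dark_obj, bg)[0, 0])  # False (kept as foreground)
```

A dark vehicle whose brightness ratio happens to fall inside `[alpha, beta]` would be wrongly erased, illustrating why such pixel-wise heuristics are error-prone across scenes.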
Little work has been done to address false foreground detection caused by headlights at night. In one existing approach, the pixels along a line perpendicular to a lane are examined inside that lane, and if all non-background pixels on the line belong to the highlight category, those pixels are set to background. However, unlike the present disclosure, the removal of headlight reflection in this approach depends on a fixed lane geometry.
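The lane-based rule described above can be sketched as follows, assuming (for illustration only) that the lane runs vertically in the image so that each image row is a line perpendicular to the lane, and that pixels have already been classified into the three categories shown:

```python
import numpy as np

BACKGROUND, FOREGROUND, HIGHLIGHT = 0, 1, 2  # illustrative pixel labels

def suppress_headlight_rows(label_map):
    """If every non-background pixel on a row (a line perpendicular to
    an assumed vertical lane) is in the highlight category, relabel
    that row's pixels as background; mixed rows are left alone."""
    out = label_map.copy()
    for r in range(out.shape[0]):
        row = out[r]                 # view into `out`, edited in place
        non_bg = row != BACKGROUND
        if non_bg.any() and np.all(row[non_bg] == HIGHLIGHT):
            row[non_bg] = BACKGROUND
    return out

labels = np.array([
    [0, 2, 2, 0],   # highlight-only row -> suppressed
    [0, 1, 2, 0],   # vehicle + highlight -> kept
], dtype=np.int64)
print(suppress_headlight_rows(labels).tolist())
# [[0, 0, 0, 0], [0, 1, 2, 0]]
```

The fixed-geometry limitation is visible in the sketch: the scan direction is hard-coded to the lane orientation, so the rule breaks down when vehicles move at arbitrary angles.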
In an existing on-board system that uses a camera mounted to a vehicle, the headlight reflection is fixed within the camera field of view and is thus easily removed. However, this approach is not applicable in a setting where the reflectance area moves with respect to the camera field of view.
In a different, luminance-based approach, headlight reflection is removed by adjusting light properties, such as luminosity, contrast, and intensity, to eliminate diffused reflections on the road. The headlight is then used to detect the vehicle by grey value. However, this approach is insufficient for detecting vehicles viewed at arbitrary angles in an environment without regularized motion patterns.
Other existing approaches use an exponential model together with a Reflection Intensity Map and a Reflection Suppression Map to track and pair headlights with vehicles. These approaches, however, are complicated.
A simpler method and system is desired for accurately detecting a vehicle in the presence of shadows and headlight reflection.