The present invention relates to image processing of video signals from natural scenes and concerns, in particular, removal of shadows in images carried by such video signals.
There are many situations in which a video camera captures images of a natural scene containing a fixed and defined region of interest (ROI). For example, in a televised sporting event, the playing field is the region of interest, in which most of the interesting action takes place (though in the context of the present invention the ROI is not necessarily limited to the playing field, but may include, for example, surrounding areas and structures). Another example is a surveillance system monitoring a scene with a defined region of interest. For clarity and compactness, the present disclosure describes the invention as applied to a sporting event, but this does not detract from its more general applicability. Accordingly, the terms “region of interest” and “field” will be used interchangeably. It is noted that a region of interest is not necessarily a single contiguous area; it may consist of a plurality of disjoint areas, or may even include the entire scene viewed by the camera, and the term should be so understood in the context of the present invention.
When a sporting event takes place outdoors in daytime, shadows are naturally cast on the field surface by fixed objects in and around the field, as well as by moving objects, primarily the players. Because of the relatively limited dynamic range of perceivable brightness values in a video image, such shadows appear more pronounced in the image than they would to a viewer present at the scene, and are thus annoying. Moreover, in areas of the field that are in the shadow of a large object (such as a large sign board or a gallery structure), important details, such as a rolling ball or the identity of players, become less visible. It would therefore be desirable to remove shadows, or at least reduce their effect, within the ROI in video images of such scenes.
British patent GB2341507 discloses a system for removing shadows from a television signal in real time. In this system, each video frame is processed so as to extract mask areas that correspond to shadow, the masks being subject to certain geometric constraints, and the pixel values that correspond to mask areas are then modified to increase brightness. Other shadow removal systems and methods using a similar approach have also been disclosed.
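By way of illustration only, the general frame-by-frame approach described above (extract a mask from each frame, then brighten the masked pixels) may be sketched as follows. This is a simplified sketch and not the method of GB2341507: the darkness threshold used to define the mask, the multiplicative gain, and all function and parameter names are assumptions introduced here, and the geometric constraints on the masks are omitted.

```python
from typing import List

Frame = List[List[int]]   # 8-bit brightness values, one inner list per row
Mask = List[List[bool]]   # True where a pixel is treated as shadow


def extract_shadow_mask(frame: Frame, threshold: int) -> Mask:
    """Mark pixels darker than `threshold` as shadow (assumed criterion)."""
    return [[pixel < threshold for pixel in row] for row in frame]


def brighten_masked(frame: Frame, mask: Mask, gain: float) -> Frame:
    """Scale masked pixels by `gain`, clamping to the 8-bit maximum."""
    return [
        [min(255, round(p * gain)) if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]


def remove_shadows_per_frame(
    frames: List[Frame], threshold: int = 80, gain: float = 2.0
) -> List[Frame]:
    """Process each frame independently; the mask is recomputed every time."""
    result = []
    for frame in frames:
        mask = extract_shadow_mask(frame, threshold)
        result.append(brighten_masked(frame, mask, gain))
    return result
```

In this sketch the mask is derived anew from every frame, so the mask extraction cost is incurred once per frame and the mask boundaries depend directly on the pixel values of that frame alone.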
All of these disclosed systems and methods share several drawbacks, which include:
(a) Extracting a mask from each frame is computationally inefficient; it either requires high processing power, which is expensive, or limits the processing to relatively unsophisticated methods.
(b) Due to inherent noise in a typical video signal, extracting a mask from each frame may result in randomly defined edges, which may cause visible artifacts in the resulting video image.
(c) Moving figures (such as players) may interfere with the mask generation process, thus leading to artifacts in the resulting video image.
(d) There is no reference to any known lighting direction in defining the mask areas, thus increasing the likelihood of false shadow identification; it is, furthermore, difficult to apply such reference when extracting masks on a frame-by-frame basis, especially if the camera's pointing direction is variable.
There is thus a clear need for a method and system for removing shadows from video images in real time, wherein the processing is efficient and the resulting effect is substantially accurate and devoid of artifacts—even in the presence of moving figures and with variable camera pointing direction.