This invention relates generally to the extraction of objects from source video, and more particularly to such extraction that is substantially automatic in nature.
An increasingly common use of computers and computerized devices is the processing of video, such as video captured in real time, or video read from a storage medium, such as a hard disk drive, a digital video disc (DVD), a video cassette recorder (VCR) tape, etc. For the processing of video, objects within the video usually need to be extracted. Objects can correspond to, for example, semantic objects, which are objects as defined perceptually by the viewer. For example, a video of a baseball game may have as its objects the various players on the field, the baseball after it is thrown or hit, etc. Object extraction is useful for object-based coding techniques, such as MPEG-4, as known within the art; for content-based visual database query and indexing applications, such as MPEG-7, as also known within the art; for the processing of objects in video sequences; etc.
Prior art object extraction techniques generally fall into one of two categories: automatic extraction and semi-automatic extraction. Automatic extraction is relatively easy for the end user to perform, since he or she needs to provide little or no input for the objects to be extracted. Automatic extraction is also useful in real-time processing of video, where user input cannot be feasibly provided in real time. The primary disadvantage to automatic extraction, however, is that as performed within the prior art the objects are not defined precisely. That is, only rough contours of objects are identified. For example, parts of the background may be included in the definition of a given object.
Conversely, semi-automatic object extraction from video requires user input. Such user input can provide the exact contours of objects, for example, so that the objects are defined more precisely as compared to prior art automatic object extraction. The disadvantage of semi-automatic extraction, however, is that user input is in fact necessary. For the lay user, this may be at best inconvenient, and at worst infeasible in the case where the user is not proficient in video applications and does not know how to provide the necessary input. Furthermore, semi-automatic extraction is ill-suited for real-time processing of video, even where a user is proficient, since typically the user cannot identify objects in real time.
Therefore, there is a need to combine the advantages of automatic and semi-automatic video object extraction techniques. That is, there is a need to combine the advantageous precise definitions afforded objects by semi-automatic techniques, with the advantageous ability to perform the object extraction in real-time, as is allowed with automatic techniques. For these and other reasons, there is a need for the present invention.
The invention relates to automatic video object extraction. In one embodiment, color segmentation and motion segmentation are performed on a source video. The color segmentation segments the video by substantially uniform color regions thereof. The motion segmentation segments the video by moving regions thereof. The color regions and the moving regions, referred to as masks in one embodiment of the invention, are then combined to define the video objects.
Embodiments of the invention provide for advantages not found within the prior art. Specifically, at least some embodiments of the invention provide for object extraction from video in a substantially automatic manner, while resulting in objects that are substantially precisely defined. The motion segmentation mask defines the basic contours of the objects, while the color segmentation mask provides for more precise boundaries of these basic contours. Thus, combined, the motion and color segmentation masks allow for video object extraction that is substantially automatic, but which still yields substantially precisely defined objects.
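By way of illustration only, the combination of the two masks can be sketched as follows. The sketch assumes a particular combining rule that is not stated in this summary: a color region is taken to belong to an object when at least a given fraction of its pixels fall within the motion mask. The function name `combine_masks`, the parameter `min_overlap`, and the toy frame data are hypothetical and are chosen solely for the example; an actual embodiment may combine the masks differently.

```python
from collections import defaultdict


def combine_masks(color_labels, motion_mask, min_overlap=0.5):
    """Combine a color segmentation with a motion mask (illustrative only).

    color_labels: 2D list of ints, one region label per pixel.
    motion_mask:  2D list of 0/1 flags, 1 where the pixel is moving.
    min_overlap:  assumed threshold; a color region is kept when at least
                  this fraction of its pixels are flagged as moving.
    Returns a 2D list of booleans marking pixels assigned to an object.
    """
    total = defaultdict(int)   # pixel count per color region
    moving = defaultdict(int)  # moving-pixel count per color region

    for label_row, motion_row in zip(color_labels, motion_mask):
        for label, is_moving in zip(label_row, motion_row):
            total[label] += 1
            if is_moving:
                moving[label] += 1

    # Keep whole color regions, so the object boundary follows color edges
    # rather than the rough outline of the motion mask.
    selected = {
        label for label in total
        if moving[label] / total[label] >= min_overlap
    }
    return [[label in selected for label in row] for row in color_labels]


# Toy 4x6 frame: three vertical color regions labeled 0, 1 and 2.
color_labels = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
]
# Rough motion mask: mostly covers region 1, spills into regions 0 and 2,
# and misses one pixel of region 1.
motion_mask = [
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]

object_mask = combine_masks(color_labels, motion_mask)
for row in object_mask:
    print("".join("#" if pixel else "." for pixel in row))
```

In this toy frame the motion mask is imprecise at the edges, yet the resulting object mask coincides exactly with color region 1, which reflects the division of roles described above: motion supplies the basic contour and color supplies the precise boundary.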
The invention includes computer-implemented methods, machine-readable media, computerized systems, and computers of varying scopes. Other aspects, embodiments and advantages of the invention, beyond those described here, will become apparent by reading the detailed description and with reference to the drawings.