The present invention relates to a method, system and apparatus for linking descriptive information, or metadata, to identified objects within a time-sequential digital signal.
It will be appreciated that the word “metadata” used throughout this document is to be construed broadly as data associated with other data, unless a contrary meaning is clearly intended in a particular case. For example, one or more video frames representing a sequence of a person (in the form of object data) walking across a frame can have metadata associated with it. The metadata can take the form of additional data which in some way describes an attribute or content of the video frame or frames. For example, the metadata can relate to information such as the colour of the person's clothes, the person's name (or age or other personal details), or can describe that the person is walking. Whilst metadata can include any form of additional data related to main data, it is preferred that the metadata be in some way descriptive of (or represent a description of) the main data.
As various team sports have become more professional, analysis of teams and individual players by coaches has grown in importance. To this end, coaches and players of a particular team often review video footage of past matches to look for identifiable errors or weaknesses in the team's strategy or game play with a view to rectifying any detected deficiencies through remedial training. Alternatively, or in addition, the movements and team play of opposing teams can be studied in an attempt to identify weaknesses which can be exploited by selection of appropriate game plans.
In the past, such analysis has been done on a relatively ad hoc basis, with coaches typically fast forwarding through video footage of one or more recorded matches. Players are identified by the coach, and manual notes taken on particular aspects of their performance. However, attempting to find the actions of a particular player from the coach's team or an opponent's team is labour intensive, particularly where multiple games need to be viewed.
One solution to this has been to carefully observe each video as it becomes available, and catalogue the appearance of each player and perhaps the action being taken by the player at each appearance. Each player's entry to the field of view in the video is recorded as either a time or frame number with respect to the video tape, and can be accessed later by going directly to the correct point on the video tape. By cataloguing such information in a computer database, it is conceivable that a computer search for a particular player could yield a list of potential points of interest, perhaps spanning a number of video-recorded matches. However, this method is still relatively labour intensive, cumbersome and time consuming. Furthermore, the information required to fill such a database can only be generated off-line after a match, and is not available in real time.
It is an object of the present invention to overcome or at least substantially ameliorate one or more of the disadvantages of the prior art.
Accordingly, in a first aspect, the present invention provides a method of generating a metadata object having links to temporal and spatial extents in a time-sequential digital signal, the method including the steps of:
identifying an object of interest in the time-sequential digital signal;
defining a link entity between metadata in the metadata object and the identified object, the link entity forming part of the metadata object;
tracking the identified object in the time-sequential digital signal and updating the link entity in the metadata object to include the identified object's new temporal and spatial extent in the time-sequential digital signal; and
associating the generated metadata object with the time-sequential digital signal.
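By way of illustration only, the steps of the first aspect might be sketched as the following data structure. This is a minimal sketch under stated assumptions, not the claimed implementation: the class names (`Extent`, `LinkEntity`, `MetadataObject`, `Signal`), the bounding-box representation of spatial extent and the use of frame numbers for temporal extent are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumed spatial extent: (x, y, width, height) in pixels.
BoundingBox = Tuple[int, int, int, int]


@dataclass
class Extent:
    """One temporal (frame number) and spatial (box) extent of an object."""
    frame: int
    box: BoundingBox


@dataclass
class LinkEntity:
    """Links one metadata entry to one identified object."""
    object_id: int
    metadata_key: str
    extents: List[Extent] = field(default_factory=list)

    def update(self, frame: int, box: BoundingBox) -> None:
        """Record the object's new temporal and spatial extent."""
        self.extents.append(Extent(frame, box))

    def temporal_extent(self) -> Tuple[int, int]:
        """First and last frame in which the object has been tracked."""
        frames = [e.frame for e in self.extents]
        return min(frames), max(frames)


@dataclass
class MetadataObject:
    """Holds the metadata and the link entities that form part of it."""
    metadata: Dict[str, str] = field(default_factory=dict)
    links: List[LinkEntity] = field(default_factory=list)

    def link(self, object_id: int, metadata_key: str,
             frame: int, box: BoundingBox) -> LinkEntity:
        """Define a link entity for a newly identified object."""
        entity = LinkEntity(object_id, metadata_key)
        entity.update(frame, box)
        self.links.append(entity)
        return entity


@dataclass
class Signal:
    """Stand-in for the time-sequential digital signal."""
    frame_count: int
    metadata_object: Optional[MetadataObject] = None


# Identify an object in frame 0 and define the link entity.
meta = MetadataObject(metadata={"person_1": "walking"})
link = meta.link(object_id=1, metadata_key="person_1", frame=0, box=(10, 20, 8, 16))

# Track the object through later frames, updating the link entity each time.
for frame, box in [(1, (12, 20, 8, 16)), (2, (14, 21, 8, 16))]:
    link.update(frame, box)

# Associate the generated metadata object with the signal.
signal = Signal(frame_count=3, metadata_object=meta)
```

In this sketch the link entity is stored inside the metadata object, so that the metadata and its temporal and spatial references travel together when the metadata object is associated with the signal.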
Preferably, the time-sequential digital signal defines a series of video frames and the object is identified on the basis of movement against a relatively stationary background in the frames. More preferably, the object is identified by comparing two or more relatively closely temporally spaced video frames from the series of video frames.
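As a hedged illustration of identifying an object by comparing closely spaced frames, the following sketch differences two greyscale frames and returns a box enclosing the pixels that changed. The frame representation (lists of pixel rows), the function name `detect_moving_box` and the threshold value are assumptions made for the example only.

```python
from typing import List, Optional, Tuple

# Assumed frame representation: rows of greyscale pixel values (0-255).
Frame = List[List[int]]


def detect_moving_box(prev: Frame, curr: Frame,
                      threshold: int = 30) -> Optional[Tuple[int, int, int, int]]:
    """Return (x, y, width, height) enclosing every pixel whose value changed
    by more than `threshold` between the two frames, or None if none did."""
    changed = [
        (x, y)
        for y, (row_prev, row_curr) in enumerate(zip(prev, curr))
        for x, (a, b) in enumerate(zip(row_prev, row_curr))
        if abs(a - b) > threshold
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


def frame_with_object(x: int, y: int, width: int = 6, height: int = 4) -> Frame:
    """Build a black frame containing a single bright pixel at (x, y)."""
    frame = [[0] * width for _ in range(height)]
    frame[y][x] = 200
    return frame


frame_a = frame_with_object(1, 1)
frame_b = frame_with_object(2, 1)  # the object has moved one pixel to the right
box = detect_moving_box(frame_a, frame_b)
```

Note that a simple difference of this kind marks both the position the object has left and the position it has moved to, so the resulting box spans both; against a stationary background, unchanged frames yield no detection.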
Desirably, the object is tracked by maintaining position information related to its position in each frame. Preferably the position information is updated for each frame.
In a preferred embodiment, the method further includes the steps of:
providing predetermined identification information related to the predetermined metadata and one or more objects likely to be identified in the time-sequential digital signal;
attempting to identify the identified object with reference to the identification information; and
in the event that an object is identified, including the identification information in the metadata which is linked to the object.
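The identification steps above might, for example, be sketched as a lookup against a table of predetermined identification information. The table contents, the choice of clothing colour as the distinguishing feature and the function names are hypothetical and serve only to illustrate the flow.

```python
from typing import Dict, Optional

# Hypothetical predetermined identification information, keyed by an
# observable feature (here, a dominant clothing colour).
KNOWN_OBJECTS: Dict[str, Dict[str, str]] = {
    "red": {"name": "Player A", "team": "home"},
    "blue": {"name": "Player B", "team": "away"},
}


def identify(observed_colour: str) -> Optional[Dict[str, str]]:
    """Attempt to identify an object from its observed feature."""
    return KNOWN_OBJECTS.get(observed_colour)


def annotate(metadata: Dict[str, str], observed_colour: str) -> bool:
    """If the object is identified, include the identification information
    in the metadata linked to it. Returns True on successful identification."""
    info = identify(observed_colour)
    if info is None:
        return False
    metadata.update(info)
    return True


linked_metadata = {"action": "walking"}
recognised = annotate(linked_metadata, "red")

unknown_metadata = {"action": "running"}
unrecognised = annotate(unknown_metadata, "green")
```

Where no match is found, the metadata already linked to the object is left unchanged and the object simply continues to be tracked without identification information.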
In a second aspect, the present invention provides a system for generating a metadata object having links to temporal and spatial extents in a time-sequential digital video signal defining a series of frames, the system including:
a video source including means for generating the time-sequential digital video signal defining a series of frames;
image processing means for identifying an object of interest having temporal and spatial extents within one or more frames in the digital video signal;
linking means for defining a link entity between the relevant metadata in the metadata object and each identified object, wherein the image processing means are configured to track the object during subsequent frames in the series, whilst the linking means maintains the link entity between the object in each frame and the metadata in the metadata object.
Preferably, the system further includes:
storage means to store predetermined identification information related to one or more classes of objects likely to be identified in the time-sequential digital video signal; and
identification means for using the predetermined identification information to recognise the object, whereby, upon recognition of an object, metadata corresponding specifically to that object is linked by a link entity thereto.
Desirably, the video source is a video camera. Preferably, the video camera includes position detection means for generating a movement signal indicative of relative panning or zooming movements of the video camera.
In a third aspect, the present invention provides an apparatus for generating a metadata object having links to temporal and spatial extents in a time-sequential digital video signal defining a series of frames, the apparatus including:
a video source including means for generating the time-sequential digital video signal defining a series of frames;
image processing means for identifying an object of interest having temporal and spatial extents within one or more frames in the digital video signal;
link entity means for defining a link entity between each object and the metadata object, wherein the image processing means are configured to track the object during subsequent frames in the series, whilst the link entity means maintains the link entity between the object in each frame and the metadata object.
In a fourth aspect, the present invention provides a computer programming product with a computer readable medium recorded thereon for generating a metadata object having links to temporal and spatial extents in a time-sequential digital signal, said computer programming product including:
an identifying module for identifying an object of interest in the time-sequential digital signal;
a defining module for defining a link entity between metadata in the metadata object and the identified object, the link entity forming part of the metadata object;
a tracking module for tracking the identified object in the time-sequential digital signal and updating the link entity in the metadata object to include the identified object's new temporal and spatial extent in the time-sequential digital signal; and
an associating module for associating the generated metadata object with the time-sequential digital signal.
In a fifth aspect, the present invention provides a method of linking predetermined metadata with a time sequential digital signal defining a series of frames, the method including the steps of:
utilising the detected difference between two or more relatively closely spaced frames in the series to detect an object in the form of a coherent motion block moving relative to a background in the frames;
defining a link entity between the object and the metadata; and
tracking the object during subsequent frames in the series, whilst maintaining the link entity between the object in each frame and the predetermined metadata.
In a sixth aspect, the present invention provides a system for linking metadata with a time-sequential digital video signal defining a series of frames, the system including:
a video source including means for generating the time-sequential digital video signal defining the series of frames;
image processing means for utilising a detected difference between two or more relatively closely spaced frames in a series to detect an object in the form of a coherent motion block moving relative to a background in the frames; and
link entity means for defining a link entity between the object and the metadata, wherein the image processing means are configured to track the object during subsequent frames in the series, whilst the link entity means maintains the link entity between the object in each frame and the metadata.
In a seventh aspect, the present invention provides a method of isolating and tracking predetermined objects in a time-sequential digital signal defined by a series of video frames, the method including the steps of:
determining an object motion field of a frame relative to a background thereof, the motion field being characterised by a plurality of motion indicators, each of which represents a motion of a spatial region of a plurality of regions of the digital image;
grouping relatively closely adjacent regions having corresponding motion indicators within a predetermined threshold range of values into one or more object regions; and
tracking each object region during subsequent video frames of the series.
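The grouping step of this aspect might be sketched as a flood fill over a grid of motion indicators. The following is a minimal sketch under assumptions: each region is taken to be a block in a rectangular grid, each motion indicator a (dx, dy) vector, "relatively closely adjacent" is read as 4-neighbour adjacency, and the `tolerance` and `min_motion` parameters are hypothetical stand-ins for the predetermined threshold range.

```python
from collections import deque
from typing import List, Set, Tuple

Vector = Tuple[float, float]  # assumed motion indicator: (dx, dy) per block
Cell = Tuple[int, int]        # (row, column) of a block in the grid


def group_object_regions(motion_field: List[List[Vector]],
                         tolerance: float = 1.0,
                         min_motion: float = 0.5) -> List[Set[Cell]]:
    """Group adjacent moving blocks with similar motion into object regions."""
    rows, cols = len(motion_field), len(motion_field[0])
    seen: Set[Cell] = set()
    regions: List[Set[Cell]] = []

    def moving(cell: Cell) -> bool:
        dx, dy = motion_field[cell[0]][cell[1]]
        return abs(dx) + abs(dy) >= min_motion

    def similar(a: Cell, b: Cell) -> bool:
        ax, ay = motion_field[a[0]][a[1]]
        bx, by = motion_field[b[0]][b[1]]
        return abs(ax - bx) <= tolerance and abs(ay - by) <= tolerance

    for r in range(rows):
        for c in range(cols):
            start = (r, c)
            if start in seen or not moving(start):
                continue
            # Breadth-first flood fill from an unvisited moving block.
            region = {start}
            seen.add(start)
            queue = deque([start])
            while queue:
                cr, cc = queue.popleft()
                for n in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    inside = 0 <= n[0] < rows and 0 <= n[1] < cols
                    if inside and n not in seen and moving(n) and similar((cr, cc), n):
                        seen.add(n)
                        region.add(n)
                        queue.append(n)
            regions.append(region)
    return regions


STILL, RIGHT, LEFT = (0.0, 0.0), (2.0, 0.0), (-2.0, 0.0)
example_field = [
    [RIGHT, RIGHT, STILL, LEFT, STILL],
    [RIGHT, RIGHT, STILL, LEFT, STILL],
    [STILL, STILL, STILL, STILL, STILL],
]
regions = group_object_regions(example_field)
```

In the example field, the four blocks moving right form one object region and the two blocks moving left form another; each resulting region could then be handed to whatever tracking method is applied during subsequent frames.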
In an eighth aspect, the present invention provides a method of tracking objects in a time-sequential digital signal originally captured by a capture device, the method including the steps of:
determining a motion vector field for selected time instances of the time-sequential digital signal;
removing components arising from motion of the capture device during capture from the motion vector field, thereby to generate an object motion field;
identifying regions of coherent motion in the object motion field, thereby to identify corresponding moving objects;
selecting one or more of the moving objects; and
applying an image processing tracking method to each selected object during subsequent time instances of the time-sequential digital signal.
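The step of removing capture-device motion from the motion vector field might, for instance, be sketched as subtracting a single global motion vector from every vector in the field. In this sketch the global vector is either supplied (for example, derived from a movement signal generated by the camera) or, as an assumption made purely for illustration, estimated as the component-wise median of the field. A single translational vector models panning only; zooming would call for a richer model.

```python
from statistics import median
from typing import List, Optional, Tuple

Vector = Tuple[float, float]  # assumed (dx, dy) motion per block


def estimate_camera_motion(motion_field: List[List[Vector]]) -> Vector:
    """Estimate global (camera) motion as the component-wise median vector.
    Assumes the background occupies most of the frame."""
    vectors = [v for row in motion_field for v in row]
    return (median([dx for dx, _ in vectors]),
            median([dy for _, dy in vectors]))


def object_motion_field(motion_field: List[List[Vector]],
                        camera_motion: Optional[Vector] = None) -> List[List[Vector]]:
    """Subtract camera motion from every vector to leave object motion only."""
    if camera_motion is None:
        camera_motion = estimate_camera_motion(motion_field)
    cx, cy = camera_motion
    return [[(dx - cx, dy - cy) for dx, dy in row] for row in motion_field]


# A panning camera makes the background appear to move by (3, 0);
# one block contains an object with apparent motion (5, 1).
raw_field = [
    [(3, 0), (3, 0), (3, 0)],
    [(3, 0), (5, 1), (3, 0)],
]
compensated = object_motion_field(raw_field)
measured = object_motion_field(raw_field, camera_motion=(3, 0))
```

After compensation the background blocks carry zero motion and only the object block retains a non-zero vector, so that regions of coherent motion in the resulting object motion field correspond to moving objects rather than to movement of the capture device.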
In a ninth aspect, the present invention provides an apparatus for linking metadata with a time-sequential digital video signal defining a series of frames, the apparatus including:
a video source including means for generating the time-sequential digital video signal defining the series of frames;
image processing means for utilising a detected difference between two or more relatively closely spaced frames in a series to detect an object in the form of a coherent motion block moving relative to a background in the frames; and
link entity means for defining a link entity between the object and the metadata, wherein the image processing means are configured to track the object during subsequent frames in the series, whilst the link entity means maintains the link entity between the object in each frame and the metadata.
In a tenth aspect, the present invention provides a computer programming product with a computer readable medium recorded thereon for linking predetermined metadata with a time sequential digital signal defining a series of frames, said computer programming product including:
a utilising module for utilising the detected difference between two or more relatively closely spaced frames in the series to detect an object in the form of a coherent motion block moving relative to a background in the frames;
a defining module for defining a link entity between the object and the metadata; and
a tracking module for tracking the object during subsequent frames in the series, whilst maintaining the link entity between the object in each frame and the predetermined metadata.