1. Field of the Invention
The present invention relates to streaming data, and more specifically, to monitoring, identifying, indexing, presenting, and viewing relevant portions of streaming data.
2. Description of the Related Art
Using graphs to represent the content of text data has provided a quick means of viewing document characteristics. These visual representations can be time saving in determining which documents to examine more thoroughly.
An article by Eick, “SeeSoft-A Tool for Visualizing Line Oriented Software Statistics”, IEEE Transactions on Software Engineering, pp. 957-968, November 1992, describes the use of a colored rectangle or a pixel as a visual indication of changes made in a source file. Each line in a program's source file is represented as a rectangle or pixel. The color of the pixel or rectangle is determined by the length of time since the line was modified. Thus, files that have been recently modified can be visually identified. In addition, the extent of the modifications can be determined by viewing the graph.
Automatic search of text documents has become common. Search engines display documents satisfying search criteria in rank order. The user then scans each text document to determine how relevant the text is to the query. A paper by Hearst entitled “Tile Bars: Visualization of Term Distribution Information in Full Text Information Access”, Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 59-66, May 1995, describes the use of title bars to indicate search relevance.
Referring to FIG. 1, a title bar 100 is illustrated. The bar 100 is composed of a rectangle 102 whose length corresponds to the length of a document. The text is segmented into paragraphs or units of some other granularity. A text box 110 graphically represents each text segment. The shade of a box indicates the number of times a search term occurs in the text segment. Eight shades of gray are typically used, where the darkest shade indicates many occurrences of the search term and the lightest shade indicates few occurrences.
There may be several rows 120 of text boxes 110, with each row representing a search term. This permits users to determine where search terms overlap in the document. The row 120 of text boxes 110 for a search term is displayed horizontally across the rectangle 102. The shade of each text box 110 indicates the number of times the search term occurs in the corresponding text segment. If more than one search term is used, an additional row 120 of text boxes 110 is displayed for each search term.
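The shading scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the method of the Hearst paper: the function names `segment_counts`, `shade_levels`, and `title_bar_rows` are hypothetical, occurrences are counted by simple case-insensitive substring matching, and counts are scaled linearly against the largest count in a row to pick one of eight shade levels.

```python
def segment_counts(segments, term):
    """Count case-insensitive substring occurrences of term in each segment."""
    needle = term.lower()
    return [segment.lower().count(needle) for segment in segments]


def shade_levels(counts, levels=8):
    """Map counts to shade indices: 0 is lightest, levels - 1 is darkest.

    Assumption: linear scaling against the largest count in the row,
    rounded up so that any nonzero count is darker than an empty segment.
    """
    peak = max(counts, default=0)
    if peak == 0:
        return [0] * len(counts)
    return [(count * (levels - 1) + peak - 1) // peak for count in counts]


def title_bar_rows(segments, terms, levels=8):
    """Build one row of shade indices per search term."""
    return {
        term: shade_levels(segment_counts(segments, term), levels)
        for term in terms
    }


if __name__ == "__main__":
    document = ["the cat sat", "cat cat dog", "dog"]
    for term, row in title_bar_rows(document, ["cat", "dog"]).items():
        print(term, row)
```

Each row holds one shade index per text segment, so segments in which several rows are dark at the same position are the places where the search terms overlap.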
Streaming media can be audio, video, or audio and video. The operative word is streaming. The data is to be viewed sequentially. Future portions of the data may not be available to the client viewer at the time the current portion is being viewed. Future individual video frames could be displayed, but the concept of a frame of audio does not exist. Some portion of the audio must be played in order to determine its content. U.S. Pat. No. 6,597,859 to Leinhart describes dividing a video stream into shots, segmented based on scene changes, and writing an abstract for each of the shots. U.S. Pat. No. 6,219,837 to Yeo provides summary frames at the bottom of a playing video to summarize what has previously occurred in the video. This enables someone just beginning to watch the video to understand what previously occurred in the current program. U.S. Pat. No. 6,463,444 to Jain proposes a metadata file to index and retrieve encoded video.
The MPEG-7 standard is an implementation of a metadata mechanism for annotating streaming media contents and the time ranges in which the annotations occur. The MPEG-7 file is an XML file, which can be used with the associated media file to position to an object or event. Streaming data can be manually or automatically annotated, with the results saved in an MPEG-7 file.
Referring to FIG. 2, samples of MPEG-7 annotations of audio and video segments within a streaming data file are illustratively shown. The key components for searching streaming data files are the FreeTextAnnotation node, which contains the event description; the TextAnnotation node, which contains the confidence level that the description is present; and the MediaTimeNode, which gives the starting time point and the duration for this description.
FIG. 2 is an example of a portion of an MPEG-7 data file demonstrating a video annotation, lines 200 through 210, and an audio annotation, lines 211 through 220. The data type of the annotation is described in line 200 (video) and line 211 (audio). The confidence level of the annotation is contained in lines 201 and 212. The annotation's description is contained in lines 203 and 214. The time when the annotation occurs within the streaming data is contained in lines 206 through 209 for the video annotation and lines 216 through 219 for the audio annotation. Lines 207 and 217 identify when the annotation starts. Lines 208 and 218 indicate the duration of the annotation.
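As a rough illustration of how such annotations might be read, the following Python sketch extracts the data type, confidence level, description, start time, and duration from a small XML fragment. The fragment is a hypothetical, simplified stand-in and is not a reproduction of FIG. 2: it omits XML namespaces, assumes the confidence is carried as an attribute of TextAnnotation, and assumes the timing is carried in MediaTimePoint and MediaDuration elements in the forms `Thh:mm:ss` and `PTnHnMnS`. The function names are likewise assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment used only for illustration.
SAMPLE = """
<Mpeg7>
  <VideoSegment>
    <TextAnnotation confidence="0.85">
      <FreeTextAnnotation>goal scored</FreeTextAnnotation>
    </TextAnnotation>
    <MediaTime>
      <MediaTimePoint>T00:01:30</MediaTimePoint>
      <MediaDuration>PT12S</MediaDuration>
    </MediaTime>
  </VideoSegment>
  <AudioSegment>
    <TextAnnotation confidence="0.60">
      <FreeTextAnnotation>crowd cheering</FreeTextAnnotation>
    </TextAnnotation>
    <MediaTime>
      <MediaTimePoint>T00:01:32</MediaTimePoint>
      <MediaDuration>PT1M5S</MediaDuration>
    </MediaTime>
  </AudioSegment>
</Mpeg7>
"""


def parse_time_point(text):
    """Convert 'Thh:mm:ss' to seconds from the start of the stream."""
    hours, minutes, seconds = text.strip().lstrip("T").split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_duration(text):
    """Convert 'PTnHnMnS' (each part optional) to whole seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def read_annotations(xml_text):
    """Return one record per annotated segment in the fragment."""
    records = []
    for segment in ET.fromstring(xml_text):
        annotation = segment.find("TextAnnotation")
        media_time = segment.find("MediaTime")
        records.append({
            "kind": "video" if segment.tag == "VideoSegment" else "audio",
            "confidence": float(annotation.get("confidence")),
            "description": annotation.findtext("FreeTextAnnotation").strip(),
            "start": parse_time_point(media_time.findtext("MediaTimePoint")),
            "duration": parse_duration(media_time.findtext("MediaDuration")),
        })
    return records


if __name__ == "__main__":
    for record in read_annotations(SAMPLE):
        print(record)
```

A real MPEG-7 file declares namespaces and nests these elements more deeply, so the element lookups would need to be adapted to the actual document.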
U.S. Pat. No. 6,567,980 to Jain describes a metadata video catalog with hyperlinks to a mechanism to display frames. A specific implementation of Jain's method would be to search MPEG-7 files for terms in much the same way that text documents are searched today. Thus, a user would be presented with a list of streaming data files that satisfy the search criteria. The user could then request the playing of these files.
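A file-level search of this kind might look like the following Python sketch. It is not Jain's method: the catalog layout, the function name `search_files`, and the ranking rule (summing the confidence of every annotation whose description contains any search term) are all assumptions made for illustration.

```python
def search_files(catalog, terms):
    """Return (file_name, score) pairs, best first, for files matching any term.

    catalog maps a file name to a list of (description, confidence) pairs,
    as might be extracted from that file's annotations.
    """
    wanted = [term.lower() for term in terms]
    ranked = []
    for file_name, annotations in catalog.items():
        # Assumed scoring rule: add up the confidence of matching annotations.
        score = sum(
            confidence
            for description, confidence in annotations
            if any(term in description.lower() for term in wanted)
        )
        if score > 0:
            ranked.append((file_name, score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical catalog contents.
CATALOG = {
    "match.mp4": [("goal scored", 0.9), ("crowd cheering", 0.7)],
    "news.mp4": [("weather report", 0.8)],
    "recap.mp4": [("goal replay", 0.5)],
}

if __name__ == "__main__":
    for file_name, score in search_files(CATALOG, ["goal"]):
        print(file_name, score)
```

The result is only a ranked list of whole files. It says nothing about where within each stream the matching annotations occur.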
The prior art does not provide for scanning ahead in streaming data to see the relevance of future portions to the search terms. Postage stamp images can be displayed showing past shots, since they can be captured at the time of display. However, displaying future shots can only be done using two streams of video data. The first stream would be the primary display. The second would be used for sample images of future shots. One difficulty with this method is that video streams require considerable bandwidth for transmission, and the real-time decoding of the compressed video taxes processor resources. Doing twice the work is not practical. In addition, displaying future frames does not address the problem of displaying the relevance of audio to the search terms.
The prior art permits repositioning in text by clicking a user interface object to move either to the next page or to the next occurrence of the search term. The common method of repositioning in audio/video is moving a sliding control to indicate where play should commence within the media. Another method is clicking on a shot frame, if present, and repositioning to that shot in the video stream. One disadvantage of these methods is that the temporal granularity is very coarse.